Crystals of DPP-IV 

Background of the Invention 

[0001] Dipeptidyl peptidase IV (DPP-IV; T-cell activation antigen CD26 or adenosine 
binding protein) is a multifunctional type II cell surface glycoprotein. The 
protein is widely expressed in a variety of cell types, particularly on differentiated 
epithelial cells of the intestine, liver, prostate tissue, corpus luteum, and kidney 
proximal tubules (Hartel, S., Gossrau, R., Hanski, C. & Reutter, W. (1988). 
Dipeptidyl peptidase (DPP) IV in rat organs. Comparison of 
immunohistochemistry and activity histochemistry. Histochemistry 89, 151-161; 
McCaughan, G.W., Wickson, J.E., Creswick, P.F. & Gorrell, M.D. (1990). 
Identification of the bile canalicular cell surface molecule GP110 as the 
ectopeptidase dipeptidyl peptidase IV: an analysis by tissue distribution, 
purification and N-terminal amino acid sequence. Hepatology 11, 534-544) as 
well as leukocyte subsets (Gorrell, M.D., Wickson, J. & McCaughan, G.W. 
(1991). Expression of the rat CD26 antigen (dipeptidyl peptidase IV) on 
subpopulations of rat lymphocytes. Cell Immunol 134, 205-215), such as T- 
helper lymphocytes, and subsets of macrophages (Buhling, F., Kunz, D., 
Reinhold, D., Ulmer, A.J., Ernst, M., Flad, H.D. & Ansorge, S. (1994). 
Expression and functional role of dipeptidyl peptidase IV (CD26) on human 
natural killer cells. Nat. Immun. 13, 270-279) and a soluble form is reported to 
be present in plasma and urine (Iwaki-Egawa, S., Watanabe, Y., Kikuya, Y. & 
Fujimoto, Y. (1998). Dipeptidyl peptidase IV from human serum: purification, 
characterization, and N-terminal amino acid sequence. J. Biochem. 124, 428- 
433). Human DPP-IV has a short cytoplasmic tail of six amino acids, a 22 
amino acid hydrophobic transmembrane region and a 738 amino acid 
extracellular domain with ten potential glycosylation sites (Tanaka, T., 
Camerini, D., Seed, B., Torimoto, Y., Dang, N.H., Kameoka, J., Dahlberg, H.N., 
Schlossman, S.F. & Morimoto, C. (1992). Cloning and functional expression of 
the T cell activation antigen CD26. J. Immunol. 149, 481-486). 

[0002] DPP-IV is involved in many biological processes, including a membrane- 
anchoring function for the localization of the extracellular enzyme adenosine 
deaminase (ADA) (Franco, R., Valenzuela, A., Lluis, C. & Blanco, J. (1998). 
Enzymatic and extraenzymatic role of ecto-adenosine deaminase in 
lymphocytes. Immunol Rev. 161, 27-42), participation in cell matrix adhesion by 
binding to collagen and fibronectin (Loster, K., Zeilinger, K., Schuppan, D. & 
Reutter, W. (1995). The cysteine-rich region of dipeptidyl peptidase IV (CD26) 
is the collagen-binding site. Biochem. Biophys. Res. Commun. 217, 341-348), 
interaction as a co-receptor for the HIV envelope protein gp120 (Ohtsuki, T., 
Tsuda, H. & Morimoto, C. (2000). Good or evil: CD26 and HIV infection. J. 
Dermatol. Sci. 22, 152-160) and co-stimulatory function during T-cell activation 
and proliferation (von Bonin, A., Huhn, J. & Fleischer, B. (1998). Dipeptidyl- 
peptidase IV/CD26 on T cells: analysis of an alternative T-cell activation 
pathway. Immunol. Rev. 161, 43-53) by interaction with the protein tyrosine 
phosphatase (CD45) (Torimoto, Y., Dang, N.H., Vivier, E., Tanaka, T., 
Schlossman, S.F. & Morimoto, C. (1991). Coassociation of CD26 (dipeptidyl 
peptidase IV) with CD45 on the surface of human T lymphocytes. J. Immunol. 
147, 2514-2517). 

[0003] DPP-IV (EC 3.4.14.5) has post-proline dipeptidyl aminopeptidase activity, 
preferentially cleaving X-proline or X-alanine dipeptides from the N-terminus of 
polypeptides (Hopsu-Havu, V.K. & Glenner, G.G. (1966). A new dipeptide 
naphthylamidase hydrolyzing glycyl-prolyl-beta-naphthylamide. Histochemie 7, 
197-201) and belongs to the prolyl oligopeptidase family, a group of atypical 
serine proteases able to hydrolyse the prolyl bond (Cunningham, D.F. & 
O'Connor, B. (1997). Proline specific peptidases. Biochim. Biophys. Acta 1343, 
160-186). It possesses a novel orientation of its catalytic triad residues (Ser-Asp- 
His) (Ikehara, Y., Ogata, S. & Misumi, Y. (1994). Dipeptidyl-peptidase IV from 
rat liver. Methods Enzymol. 244, 215-227), inverse to that found in classical 
serine proteases (His-Asp-Ser). The cleavage of N-terminal peptides with Pro in 
the second position is a rate limiting step in the degradation of peptides. The 
natural substrates of DPP-IV include several chemokines, cytokines, 
neuropeptides, circulating hormones and bioactive peptides (Lambeir, A.M., 
Durinx, C., Proost, P., Van Damme, J., Scharpe, S. & De Meester, I. (2001). 
Kinetic study of the processing by dipeptidyl-peptidase IV/CD26 of 
neuropeptides involved in pancreatic insulin secretion. FEBS Lett. 507, 327- 
330). The wide range of substrates suggests a key regulatory role in the 
metabolism of peptide hormones and in amino acid transport (Hildebrandt, M., 
Reutter, W., Arck, P., Rose, M. & Klapp, B.F. (2000). A guardian angel: the 
involvement of dipeptidyl peptidase IV in psychoneuroendocrine function, 
nutrition and immune defence. Clin Sci 99, 93-104). Its physiological relevance 
has been investigated (Hinke, S.A., Pospisilik, J.A., Demuth, H.U., Mannhart, 
S., Kuhn-Wache, K., Hoffmann, T., Nishimura, E., Pederson, R.A. & McIntosh, 
C.H. (2000). Dipeptidyl peptidase IV (DPIV/CD26) degradation of glucagon. 
Characterization of glucagon degradation products and DPIV-resistant analogs. 
J. Biol. Chem. 275, 3827-3834). 
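
The N-terminal cleavage preference described in this paragraph can be illustrated with a minimal sketch. It encodes only the stated rule (proline or alanine in the second position) and ignores everything else that influences cleavage in practice. The function name and the one-letter peptide strings are illustrative assumptions; the GLP-1(7-36) sequence is the commonly cited one and is not taken from this specification.

```python
from typing import Optional, Tuple


def dpp4_cleave(peptide: str) -> Optional[Tuple[str, str]]:
    """Apply the simplified X-Pro / X-Ala rule to a one-letter peptide string.

    Returns (released dipeptide, remaining peptide), or None when the
    N-terminus does not match the rule. Hypothetical helper, not a
    validated substrate predictor.
    """
    seq = peptide.upper()
    # A dipeptide can only be released if at least one residue remains.
    if len(seq) < 3:
        return None
    # Position 2 (index 1) must be proline or alanine.
    if seq[1] in ("P", "A"):
        return seq[:2], seq[2:]
    return None


# Assumed example inputs (one-letter code).
GLP1_7_36 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVKGR"  # His-Ala at the N-terminus
DIPROTIN_A = "IPI"                            # Ile-Pro-Ile

print(dpp4_cleave(GLP1_7_36))   # N-terminal His-Ala matches the rule
print(dpp4_cleave(DIPROTIN_A))  # Ile-Pro matches the rule
print(dpp4_cleave("GKL"))       # Lys in position 2 does not match
```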

[0004] The finding that DPP-IV is responsible for more than 95% of the degradation of 
GLP-1 led to an elevated interest in inhibition of this enzyme for the treatment 
of type II diabetes. Experiments in rats and humans have provided evidence that 
specific DPP-IV inhibition increased Cmax, T1/2 and total circulating GLP-1 and 
decreased plasma glucose. It has been demonstrated that patients with impaired 
glucose tolerance (IGT), type II diabetes and with a secondary failure to 
respond to sulfonylurea treatment benefit from increased levels of GLP-1 
peptides. In addition GLP-1 is effective in type-I diabetic patients due to its 
glucagono-static effect. More recent investigations show a delay of gastric 
emptying that could have beneficial effects on satiety and might be relevant for 
the treatment of obesity. Protection of functional GLP-1 by inhibition of DPP- 
IV and concomitant activation of the GLP-1 receptor might therefore have a 
synergistic potential in anti-diabetic drug research (Holst, J.J. & Deacon, C.F. 
(1998). Inhibition of the activity of dipeptidyl-peptidase IV as a treatment for 
type 2 diabetes. Diabetes 47, 1663-1670). Selective and orally available small 
molecule inhibitors of DPP-IV have been discovered and are now in clinical 
trials (Villhauer, E.B., Brinkman, J.A., Naderi, G.B., Dunning, B.E., Mangold, 
B.L., Mone, M.D., Russell, M.E., Weldon, S.C. & Hughes, T.E. (2002). 1-[2-[(5- 
Cyanopyridin-2-yl)amino]ethylamino]acetyl-2-(S)-pyrrolidinecarbonitrile: a 
potent, selective, and orally bioavailable dipeptidyl peptidase IV inhibitor with 
antihyperglycemic properties. J. Med. Chem. 45, 2362-2365; Pospisilik, J.A., 
Stafford, S.G., Demuth, H.U., McIntosh, C.H. & Pederson, R.A. (2002). Long- 
term treatment with dipeptidyl peptidase IV inhibitor improves hepatic and 
peripheral insulin sensitivity in the VDF Zucker rat: a euglycemic- 
hyperinsulinemic clamp study. Diabetes 51, 2677-2683). 

Summary of the Invention 

[0005] The present invention provides a crystal of the extracellular domain of 
mammalian DPP-IV wherein the crystal has an orthorhombic space group of 
P2₁2₁2₁ and one homodimer of DPP-IV in the asymmetric unit. 

[0006] The crystal of the present invention has unit cell dimensions of: 
a is from 63 Å to 70 Å; 
b is from 66 Å to 70 Å; 
c is from 416 Å to 424 Å; 
and a P2₁2₁2₁ symmetry. 
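
As an illustration of what these cell dimensions imply, the following sketch checks a cell against the ranges of this paragraph and estimates the Matthews coefficient and solvent fraction. The example cell (65 x 68 x 420 Å), the subunit mass of 85 kDa and all function names are assumptions made for the example, not values taken from this specification. The orthorhombic volume V = a·b·c, the four asymmetric units per cell in space group No. 19, and the approximation solvent fraction ≈ 1 - 1.23/V_M are standard crystallographic relations.

```python
# Ranges (in Å) as stated in paragraph [0006].
CLAIMED_RANGES = {"a": (63.0, 70.0), "b": (66.0, 70.0), "c": (416.0, 424.0)}


def in_claimed_ranges(a: float, b: float, c: float) -> bool:
    """True if every cell edge lies within its stated range (bounds inclusive)."""
    cell = {"a": a, "b": b, "c": c}
    return all(lo <= cell[k] <= hi for k, (lo, hi) in CLAIMED_RANGES.items())


def cell_volume(a: float, b: float, c: float) -> float:
    """Volume of an orthorhombic cell (all angles 90 degrees), in Å^3."""
    return a * b * c


def matthews_coefficient(volume: float, mass_per_asu_da: float,
                         asu_per_cell: int = 4) -> float:
    """V_M in Å^3/Da; the orthorhombic space group No. 19 has 4 asymmetric units."""
    return volume / (asu_per_cell * mass_per_asu_da)


def solvent_fraction(vm: float) -> float:
    """Common approximation: 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / vm


# Hypothetical example values, chosen near the middle of the stated ranges.
a, b, c = 65.0, 68.0, 420.0
ASSUMED_SUBUNIT_MASS_DA = 85_000.0  # assumption, not from the specification

volume = cell_volume(a, b, c)
vm = matthews_coefficient(volume, 2 * ASSUMED_SUBUNIT_MASS_DA)  # one homodimer per ASU
print(in_claimed_ranges(a, b, c), volume, round(vm, 2), round(solvent_fraction(vm), 2))
```

Under these assumptions the estimate falls in the range typically observed for protein crystals, which is consistent with one homodimer per asymmetric unit.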

[0007] Also provided is a co-crystal which includes a ligand bound to the active site of 
mammalian DPP-IV. 

[0008] The invention permits the identification or design of inhibitor compounds of 
DPP-IV activity for use in treatment of type II diabetes. 

Brief Description of the Figures 

[0009] The patent or application file contains at least one drawing executed in color. 
Copies of this patent or patent application publication with color drawing(s) will 
be provided by the Office upon request and payment of the necessary fee. 

[0010] Figure 1. Sequence alignment of DPP-IV and POP: Amino acid sequence 
alignment of DPP-IV from human (hDPP-IV) and rat (rDPP-IV, only different 
residues are shown). The alignment of porcine POP was performed using 
structural superposition for the α/β-hydrolase domain only, because of a lack of 
structural homology for the β-propeller domain. The top line gives additional 
information about the secondary structure of DPP-IV (yellow arrows and red 
bars), the glycosylation sites with visible electron density (Y), the potential 
glycosylation sites (marked in red), the disulphide bonds (green lines between 
the cysteines that are involved) and an arrow that indicates the start of the cloned 
ectodomain. Sequences are highlighted light gray for the transmembrane part, 
gray for the part of the β-propeller involved in dimerization, green for residues 
involved in adenosine deaminase binding, blue for the tyrosine that is involved 
in the stabilization of the oxyanion of the catalytic intermediate and pink for the 
catalytic residues. 

[0011] Figure 2. Overall Structure of DPP-IV: Ribbon diagram of DPP-IV viewed 
perpendicular to the two-fold axis. The domains are colored dark green and light 
green for the α/β-hydrolase and β-propeller domains of subunit A and dark/light 
blue for the other subunit, respectively. The overall dimension of the molecule is 
about 125 x 80 x 60 Å³. The active site is highlighted by the catalytic residues in 
ball and stick representation as well as residues that are identified by 
mutagenesis data to be important for ADA binding. The proposed location at 
the cell surface is shown by the schematic drawing of the membrane. This figure 
was prepared using Molscript (Kraulis, P.J. (1991). MOLSCRIPT: A program to 
produce both detailed and schematic plots of protein structures. J. Applied 
Crystallogr. 24, 946-950) and rendered with Raster3D (Merritt, E.A. & Bacon, D.J. 
(1997). Raster3D: photorealistic molecular graphics. Methods Enzymol. 277, 505- 
524). 

[0012] Figure 3. Ribbon drawing of the β-propeller domains of DPP-IV and POP: 

A: DPP-IV has 8 repeats (blades, numbered 1 to 8) of a structural motif that 
consists of four antiparallel β-strands. Additional secondary structural 
elements are colored magenta: An antiparallel β-sheet (β2/2a and β2/2b in 
Figure 1) that is an extension of blade 2 with Arg125 at the tip of the turn that is 
involved in the substrate binding. An α-helix (α2* in Figure 1) with the C- 
terminal glutamate rich loop that contributes to substrate recognition and 
specificity (Glu204/205/206). The antiparallel β-sheet that forms a main part of 
the dimer interface (β1* and β2* in Figure 1). The latter structural elements are 
extensions of blade 4. 

B: β-propeller domain of DPP-IV rotated by 90°. 

C: POP has 7 blades and no notable deviations from the β-propeller structure. 
The blades are numbered 1 to 7. 

[0013] Figure 4. Access to the active site: Schematic view of the subunit of DPP-IV 
with the active site surface coloured according to the atom types. The substrate 
Diprotin A is shown with white carbons indicating the substrate binding site. 
Arrows illustrate that the substrate may enter the active site at the well accessible 
and open active site cleft and the dipeptidic product of the catalytic reaction may 
leave the active site cavity via the narrower tunnel that is formed by the β- 
propeller. 

[0014] Figure 5. Active site of DPP-IV with Diprotin A (Ile-Pro-Ile): The substrate 
Diprotin A is trapped as a tetrahedral intermediate covalently bound to the active 
site Ser630. Dashed lines indicate hydrogen bonds. Bonds are dark blue for the 
protein and light blue for the ligand as well as the active site Ser630. Drawn with 
MOLOC (Gerber, P.R. (1992). Peptide mechanics: a force field for peptides and 
proteins working with entire residues as small units. Biopolymers 32, 1003- 
1017). The inset shows the omit electron density (ligand and Ser630 were 
omitted from the calculations) contoured at 2.5 σ (green) and 4 σ (yellow). 

Detailed Description of the Invention 

[0015] The present invention relates to crystals of mammalian DPP-IV, with or without 
a ligand bound in the active site, where the crystals are of sufficient quality and 
size to allow for the determination of the three-dimensional structure by X-ray 
diffraction at atomic resolution. The invention also relates to methods for producing and 
crystallizing the mammalian DPP-IV. The crystals of mammalian DPP-IV, as 
well as information derived from their crystal structures can be used to analyze 
and modify mammalian DPP-IV activity as well as to identify compounds that 
interact with DPP-IV. 

[0016] In one aspect the present invention provides a crystal of the extracellular domain 
of mammalian DPP-IV, preferably having the orthorhombic space group 
symmetry P2₁2₁2₁ and one homodimer of DPP-IV in the asymmetric unit. 
Preferably, the crystal includes a unit cell having dimensions a, b, and c; wherein 
a is from 63 Å to 67 Å, b is from 66 Å to 70 Å, and c is from 416 Å to 424 Å; and α 
= β = γ = 90°. Preferably, the crystal includes atoms arranged in a spatial 
relationship represented by the atomic structure coordinates listed in Table 4. 
Preferably, the crystal includes DPP-IV comprising the amino acid sequence 
from Gly31 to Pro766 of the native protein as well as shorter variants thereof 
comprising all amino acids necessary for forming the active site. Preferably, the 
crystal includes DPP-IV as set forth in SEQ ID NO:2 as well as shorter variants 
thereof comprising all amino acids necessary for forming the active site. 

[0017] The crystals of the invention include apo crystals and co-crystals. The apo 
crystals of the invention refer to crystals of mammalian DPP-IV formed without 
a bound active site or allosteric ligand. The co-crystals generally comprise DPP- 
IV with a ligand bound to the active site or to an allosteric site. The "active site" 
refers in general to the site where the enzymatic reaction catalyzed by the 
enzyme takes place. An active site ligand refers to any compound which 
specifically binds to the active site of a mammalian DPP-IV. 

[0018] Preferably, the co-crystal of the present invention is characterized as having an 
orthorhombic space group of P2₁2₁2₁ (space group No. 19) and one homodimer 
of DPP-IV in the asymmetric unit. 

[0019] More preferably, the co-crystal has unit cell dimensions of a is from 63 Å to 67 Å, 
b is from 66 Å to 70 Å, and c is from 416 Å to 424 Å; and α = β = γ = 90° and a 
P2₁2₁2₁ symmetry. 

[0020] The co-crystals of the invention generally comprise a crystalline DPP-IV 
polypeptide in association with one or more compounds at an active or allosteric 
binding site of the polypeptide. The association may be covalent or non- 
covalent. 

[0021] The DPP-IV (dipeptidyl peptidase IV; T-cell activation antigen CD26 or 
adenosine binding protein) of the present invention may be a mammalian DPP- 
IV. Preferably, the DPP-IV of the present invention is a human DPP-IV. More 
preferably, the DPP-IV of the present invention is the extracellular domain of 
DPP-IV. Even more preferred is the extracellular domain of DPP-IV which is 
soluble. Most preferably, the human DPP-IV comprises the amino acid sequence 
from Gly31 to Pro766 of the native protein as well as shorter variants thereof 
comprising all amino acids necessary for forming the active site. Preferably, 
DPP-IV comprises the amino acid sequence as set forth in SEQ ID NO:2 as well 
as shorter variants thereof comprising all amino acids necessary for forming the 
active site. 

[0022] It is to be understood that the crystals of DPP-IV of the invention are not limited 
to naturally occurring or native DPP-IV. Indeed, the crystals of the invention 
include mutants of the native DPP-IV. Mutants of native DPP-IV are obtained 
by replacing at least one amino acid residue in a native DPP-IV domain with a 
different amino acid residue, or by adding or deleting amino acid residues 
within the native polypeptide or at the N- or C- terminus of the native 
polypeptide, and have substantially the same three-dimensional structure as the 
native DPP-IV from which the mutant is derived. 

[0023] By having substantially the same three-dimensional structure is meant having a 
set of atomic structure coordinates from an apo- or co-crystal that have a root 
mean square deviation of less than or equal to about 1.5 Å when superimposed 
with the atomic structure coordinates of the native DPP-IV when at least 50% of 
the alpha carbon atoms of DPP-IV are included in the superposition. 
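
The criterion of this paragraph can be sketched as follows. The example assumes that the two coordinate sets have already been superimposed and paired atom by atom (the superposition step itself, for instance a least-squares fit, is not shown), and it treats the 1.5 Å cutoff and the 50% fraction as inclusive bounds. The function names and the toy coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root mean square deviation between paired, already superimposed atoms (Å)."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def substantially_same(native_ca: Sequence[Point], other_ca: Sequence[Point],
                       total_ca_in_native: int,
                       max_rmsd: float = 1.5, min_fraction: float = 0.5) -> bool:
    """True if enough alpha carbons are included and their RMSD is within the cutoff."""
    fraction_included = len(native_ca) / total_ca_in_native
    return fraction_included >= min_fraction and rmsd(native_ca, other_ca) <= max_rmsd


# Toy data: three alpha carbons, each displaced by 1 Å along y.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x, y + 1.0, z) for x, y, z in native]

print(rmsd(native, shifted))
print(substantially_same(native, shifted, total_ca_in_native=4))
```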

[0024] In some instances, it may be particularly advantageous or convenient to 
substitute, delete and/or add amino acid residues to a native DPP-IV domain in 
order to provide convenient cloning sites in cDNA encoding the polypeptide, to 
aid in purification of the polypeptide, etc. Such substitutions, deletions and/or 
additions which do not substantially alter the three dimensional structure of the 
native DPP-IV will be apparent to those skilled in the art. 



[0025] It should be noted that the mutants contemplated herein need not exhibit DPP- 
IV activity. Indeed, amino acid substitutions, additions or deletions that 
interfere with the peptidase activity of the DPP-IV but which do not significantly 
alter the three-dimensional structure of the domain are specifically 
contemplated by the invention. Such crystalline polypeptides, or the atomic 
structure coordinates obtained therefrom, can be used to identify compounds 
that bind to the native domain. These compounds may affect the activity of the 
native domain. 

[0026] The derivative crystals of the invention generally comprise a crystalline DPP-IV 
polypeptide in covalent association with one or more heavy metal atoms. The 
polypeptide may correspond to a native or a mutated DPP-IV. Heavy metal 
atoms useful for providing derivative crystals include, by way of example and 
not limitation, gold and mercury. Alternatively, derivative crystals can be 
formed from proteins which have heavy atoms incorporated into one or more 
amino acids, such as seleno-methionine substitutions for methionine. 

[0027] Therefore, in a preferred embodiment of the present invention the co-crystal is a 
co-crystal of the extracellular domain of mammalian DPP-IV and HgCl₂. 

[0028] The native and mutated DPP-IV polypeptides described herein may be isolated 
from natural sources or produced by methods well known to those skilled in the 
art of molecular biology. Expression vectors to be used may contain a native or 
mutated DPP-IV polypeptide coding sequence and appropriate transcriptional 
and/or translational control signals. These methods include in vitro 
recombinant DNA techniques, synthetic techniques and in vivo 
recombination/genetic recombination. See, for example, the techniques 
described in Maniatis et al., 1989, Molecular Cloning: A Laboratory Manual, Cold 
Spring Harbor Laboratory, NY; and Ausubel et al., 1989, Current Protocols in 
Molecular Biology, Greene Publishing Associates and Wiley Interscience, NY. 



[0029] A variety of host-expression vector systems may be utilized to express the DPP- 
IV coding sequence. These include but are not limited to microorganisms such 
as bacteria transformed with recombinant bacteriophage DNA, plasmid DNA or 
cosmid DNA expression vectors containing the DPP-IV coding sequence; yeast 
transformed with recombinant yeast expression vectors containing the DPP-IV 
coding sequence; insect cell systems infected with recombinant virus expression 
vectors (e.g. baculovirus) containing the DPP-IV coding sequence; plant cell 
systems infected with recombinant virus expression vectors (e.g., cauliflower 
mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with 
recombinant plasmid expression vectors (e.g., Ti plasmid) containing the DPP- 
IV coding sequence; or animal cell systems. The expression elements of these 
systems vary in their strength and specificities. Depending on the host/vector 
system utilized, any of a number of suitable transcription and translation 
elements, including constitutive and inducible promoters such as pL of 
bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid promoter) and the like may be 
used; when cloning in insect cell systems, promoters such as the baculovirus 
polyhedrin promoter may be used; when cloning in plant cell systems, 
promoters derived from the genome of plant cells (e.g., heat shock promoters; 
the promoter for the small subunit of RUBISCO; the promoter for the 
chlorophyll a/b binding protein) or from plant viruses (e.g., the 35S RNA 
promoter of CaMV; the coat protein promoter of TMV) may be used; when 
cloning in mammalian cell systems, promoters derived from the genome of 
mammalian cells (e.g., metallothionein promoter) or from mammalian viruses 
(e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter) may be 
used; when generating cell lines that contain multiple copies of the DPP-IV 
coding sequence, SV40-, BPV- and EBV-based vectors may be used with an 
appropriate selectable marker. 

[0030] In a preferred embodiment of the present invention, an isolated nucleic acid 
sequence encoding the soluble extracellular domain of DPP-IV comprising the 
nucleotide sequence of SEQ ID NO:1 is provided. 

[0031] Additionally, an expression vector containing an isolated nucleic acid sequence 
encoding the soluble extracellular domain of DPP-IV comprising the nucleotide 
sequence of SEQ ID NO:1 is provided. Preferably, the expression vector is a vector 
for the expression of proteins in P. pastoris which are to be secreted. Furthermore, a 
host cell transformed with the said expression vector is provided. Preferably, the 
host cell is Pichia pastoris. 

[0032] A further aspect of the present invention relates to a method of producing the 
soluble extracellular domain of DPP-IV comprising culturing the host cell transformed 
with the said expression vector under conditions permitting the expression of the 
soluble extracellular domain of DPP-IV by the host cell. Preferably, the host cell 
is P. pastoris. The present invention also provides the soluble extracellular 
domain of DPP-IV produced by this method. 

[0033] Furthermore, the present invention relates to a polypeptide comprising the 
soluble extracellular domain of DPP-IV as set forth in SEQ ID NO:2. 

[0034] The apo-, derivative and co-crystals of the invention can be obtained by 
techniques well-known in the art of protein crystallography, including batch, 
liquid bridge, dialysis, vapor diffusion and hanging drop methods (see e.g. 
McPherson, 1982, Preparation and Analysis of Protein Crystals, John Wiley, NY; 
McPherson, 1990, Eur. J. Biochem. 189:1-23; Webber, 1991, Adv. Protein Chem. 
41:1-36; Crystallization of Nucleic Acids and Proteins, Edited by Arnaud 
Ducruix and Richard Giege, Oxford University Press; Protein Crystallization 
Techniques, Strategies, and Tips, Edited by Terese Bergfors, International 
University Line, 1999). Generally, the apo- or co-crystals of the invention are 
grown by placing a substantially pure DPP-IV polypeptide in an aqueous buffer 
containing a precipitant at a concentration just below that necessary to 
precipitate the protein. Water is then removed from the solution by controlled 
evaporation to produce crystallizing conditions, which are maintained until 
crystal growth ceases. 

[0035] Preferably, the crystals are produced by a method for crystallizing mammalian 
DPP-IV, the method comprising (a) providing a buffered, aqueous solution of 
pH 7 to 8.5 with a concentration of 7 mg/ml to 22 mg/ml of the extracellular 
domain of mammalian DPP-IV; and (b) growing crystals by vapor diffusion 
using a buffered reservoir solution with between 10% and 30% PEG, between 
10% and 20% glycerol, wherein PEG has an average molecular weight between 
1000 and 20000. More preferably, the extracellular domain of mammalian DPP- 
IV of step (a) of the method is produced in the yeast Pichia pastoris (P. pastoris) 
and then deglycosylated. For deglycosylation, different enzymes may be used, 
including Endoglycosidase F or PNGase. 
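
The parameter window of this method lends itself to a simple bookkeeping sketch, for example when laying out a vapor diffusion screen. The code below only encodes the ranges stated in this paragraph, treating the bounds as inclusive, which is an assumption. The dictionary keys, function names and the example screen values are hypothetical and do not describe the conditions actually used.

```python
from itertools import product
from typing import Dict, Iterable, List, Tuple

# Ranges as stated in paragraph [0035]; inclusive bounds are an assumption.
CLAIMED_WINDOW: Dict[str, Tuple[float, float]] = {
    "ph": (7.0, 8.5),
    "protein_mg_per_ml": (7.0, 22.0),
    "peg_percent": (10.0, 30.0),
    "glycerol_percent": (10.0, 20.0),
    "peg_avg_mw": (1000.0, 20000.0),
}


def within_window(conditions: Dict[str, float]) -> bool:
    """True if every listed parameter lies inside the stated window."""
    return all(lo <= conditions[name] <= hi
               for name, (lo, hi) in CLAIMED_WINDOW.items())


def reservoir_grid(peg_steps: Iterable[float],
                   glycerol_steps: Iterable[float]) -> List[Tuple[float, float]]:
    """All (PEG %, glycerol %) reservoir combinations that fall inside the window."""
    peg_lo, peg_hi = CLAIMED_WINDOW["peg_percent"]
    gly_lo, gly_hi = CLAIMED_WINDOW["glycerol_percent"]
    return [(p, g) for p, g in product(peg_steps, glycerol_steps)
            if peg_lo <= p <= peg_hi and gly_lo <= g <= gly_hi]


# Hypothetical trial and screen layout.
trial = {"ph": 7.5, "protein_mg_per_ml": 15.0, "peg_percent": 20.0,
         "glycerol_percent": 15.0, "peg_avg_mw": 4000.0}
print(within_window(trial))
print(reservoir_grid([10.0, 20.0, 30.0, 40.0], [10.0, 15.0, 20.0]))
```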

[0036] Preferably, co-crystals are produced by a method for co-crystallizing mammalian 
DPP-IV and an active site ligand, the method comprising (a) providing a 
buffered, aqueous solution of pH 7 to 8.5 with a concentration of 7 mg/ml to 22 
mg/ml of the extracellular domain of mammalian DPP-IV; (b) adding a molar 
excess of the active site ligand to the aqueous solution of mammalian DPP-IV; and 
(c) growing crystals by vapor diffusion using a buffered reservoir solution with 
between 10% and 30% PEG, between 10% and 20% glycerol, wherein PEG has 
an average molecular weight between 1000 and 20000. More preferably, the 
extracellular domain of mammalian DPP-IV of step (a) of the method is 
produced in P. pastoris and then deglycosylated. 

[0037] A further aspect of the present invention relates to a crystal produced by the 
methods for crystallizing or co-crystallizing DPP-IV of the present invention. 

[0038] Crystals may be frozen prior to data collection. 

[0039] The mosaic spread of the frozen crystals could sometimes be reduced by 
annealing, wherein the stream of cold nitrogen gas is briefly blocked, allowing 
the frozen crystal to thaw momentarily before re-freezing in the nitrogen gas 
stream. 

[0040] Diffraction data typically extending to 2.7 Å were collected from the frozen 
crystals at the synchrotron beamline x06 at the Swiss Light Source (SLS), Villigen, 
Switzerland. Under optimum conditions, data extending to 2.1 Å were recorded. 
Preferably, the data are collected at a resolution of 3.5 Å to 2.1 Å or better. 
More preferably, the data are collected at a resolution of 2.7 Å to 2.1 Å or better. 

[0041] Derivative crystals of the invention can be obtained by soaking apo or co-crystals 
in mother liquor containing salts of heavy metal atoms, according to procedures 
known to those of skill in the art of X-ray crystallography. 

[0042] Co-crystals of the invention can be obtained by soaking an apo crystal in mother 
liquor containing a ligand that binds to the active site, or can be obtained by co- 
crystallizing the DPP-IV polypeptide in the presence of one or more ligands that 
bind to the active site or to an allosteric site. Preferably, co-crystals are formed 
with an active site DPP-IV ligand which is slowly hydrolysable and forms a 
covalent bond. One example for such an active site ligand is Diprotin A. 

[0043] In a further embodiment of the present invention a method for determining the 
three-dimensional structure of a crystallized extracellular domain of mammalian 
DPP-IV to a resolution of 3.5 Å to 2.1 Å or better is provided, the method 
comprising 

(a) crystallizing an extracellular domain of mammalian DPP-IV; and 

(b) analyzing the extracellular domain of mammalian DPP-IV by X-ray 
diffraction to determine the three-dimensional structure of the crystallized 
extracellular domain of mammalian DPP-IV, whereby the three-dimensional 
structure of a crystallized extracellular domain of mammalian DPP-IV is 
determined to a resolution of about 3.5 Å to 2.1 Å or better. 

[0044] The present invention further relates to a machine-readable data storage 
medium comprising a data storage material encoded with machine readable data 
which, when using a machine programmed with instructions for using said data, 
displays a graphical three-dimensional representation of a molecule or molecular 
complex comprising at least a portion of the extracellular domain of mammalian 
DPP-IV comprising the amino acids of SEQ ID NO:2, the extracellular domain 
comprising the ligand binding active site being defined by a set of points having 
a root mean square deviation of less than about 1.5 Å from points representing 
the backbone atoms of said amino acids as represented by structure coordinates 
listed in Table 4. 

[0045] The crystals of the invention, and particularly the atomic structure coordinates 
obtained therefrom, have a wide variety of uses. For example, the crystals and 
structure coordinates described herein are particularly useful for identifying 
compounds that interact with DPP-IV as an approach towards developing new 
therapeutic agents. Pharmaceutical compositions of said compounds can be 
developed, and said compounds can be used for the manufacture of a 
medicament comprising said compound for the treatment of IGT, type I and 
type II diabetes, obesity and cancer. 

[0046] Therefore, the present invention also relates to the use of a crystal or a co-crystal 
of the invention for the identification and/or design of inhibitors of DPP-IV 
activity. 

[0047] Moreover, the present invention relates to a method for identifying a compound 
that interacts with DPP-IV, comprising the steps of 

(a) generating a three-dimensional model of DPP-IV using the structure 
coordinates listed in Table 4, or coordinates having a root mean square deviation 
from the backbone atoms of said amino acids of less than 1.5 Å; and 

(b) employing said three-dimensional model to design or select a compound 
that interacts with DPP-IV. 

[0048] In another aspect, the method further comprises the steps of 

(c) obtaining the identified compound; and 

(d) contacting the obtained compound with DPP-IV in order to determine the 
effect the compound has on DPP-IV activity. 

[0049] The compound in these methods may be a compound that interacts with the 
active site of DPP-IV or may be a compound that interacts with an allosteric site 
of DPP-IV. Preferred are compounds which interact with the active site of DPP- 
IV. Even more preferred are compounds which show an inhibitory effect on 
DPP-IV activity in step (d) of the methods of the present invention. 

[0050] In a further aspect of the present invention the method for identifying a 
compound that interacts with DPP-IV is a computer-assisted method. 
Preferably, determining whether the compound is expected to bind to or 
interfere with the molecule or molecular complex includes performing a fitting 
operation between the compound and a binding site or substrate binding surface 
of the molecule or molecular complex, followed by computationally analyzing 
the results of the fitting operation to quantify the association between, or the 
interference with, the compound and the binding site. Optionally, the method 
further includes screening a library of compounds. Optionally, the method 
further includes supplying or synthesizing the compound, then assaying the 
compound to determine whether it interacts with and has an effect on 
mammalian DPP-IV activity. 

[0051] The present invention also relates to the compounds identified by the said 
methods for identifying a compound that interacts with DPP-IV. 

[0052] The structure coordinates described herein can be used as phasing models in 
determining the crystal structures of additional native or mutated DPP-IV, as 
well as the structures of co-crystals of such DPP-IV with active site inhibitors or 
activators bound. The structure coordinates, as well as models of the three- 
dimensional structures obtained therefrom, can also be used to aid the 
elucidation of solution-based structures of native or mutated DPP-IVs, such as 
those obtained via NMR. Thus, the crystals and atomic structure coordinates of 
the invention provide a convenient means for elucidating the structures and 
functions of DPP-IV or other prolyl oligopeptidases. 

[0053] For purposes of clarity and discussion, the crystals of the invention will be 
described by reference to specific exemplary DPP-IV apo crystals and co-crystals. 
Those skilled in the art will appreciate that the principles described herein are 
generally applicable to crystals of any mammalian DPP-IV, including, but not 
limited to, DPP-IV. 

[0054] Increased levels of glucagon-like peptide 1 (GLP-1) are beneficial for the decrease 
of plasma glucose in humans. The finding that DPP-IV is responsible for more 
than 95% of the degradation of GLP-1 led to an elevated interest in inhibition of 
this enzyme for the treatment of diabetes type II. Experiments in rats and 
humans have provided evidence that specific DPP-IV inhibition increased Cmax, 
T1/2 and total circulating GLP-1 and decreased plasma glucose. It has been 
demonstrated that patients with impaired glucose tolerance (IGT), type II 
diabetes and with a secondary failure to respond to sulfonylurea treatment 
benefit from increased levels of GLP-1 peptides. In addition, GLP-1 is effective in 
type I diabetic patients due to its glucagonostatic effect. More recent 
investigations show a delay of gastric emptying that could have beneficial effects 
on satiety and might be relevant for the treatment of obesity. Protection of 
functional GLP-1 by inhibition of DPP-IV and concomitant activation of the 
GLP-1 receptor might therefore have a synergistic potential in anti-diabetic drug 
research (Holst, J.J. & Deacon, C.F. (1998). Inhibition of the activity of 
dipeptidyl-peptidase IV as a treatment for type 2 diabetes. Diabetes 47, 
1663-1670). Selective and orally available small molecule inhibitors of DPP-IV have 
been discovered and are now in clinical trials. 

[0055] Therefore, in a further aspect of the present invention a pharmaceutical 
composition comprising the compound identified by the methods of the present 
invention as having an effect on DPP-IV activity, or pharmaceutically acceptable 
salts thereof, and a pharmaceutically acceptable carrier is provided. 

[0056] The phrase "pharmaceutically acceptable" is employed herein to refer to those 
compounds, materials, compositions, and/or dosage forms which are, within the 
scope of sound medical judgment, suitable for use in contact with the tissues of 
human beings and animals without excessive toxicity, irritation, allergic 
response, or other problem or complication, commensurate with a reasonable 
benefit/risk ratio. 

[0057] As used herein, "pharmaceutically acceptable salts" refer to derivatives of the 
disclosed compounds wherein the parent compound is modified by making acid 
or base salts thereof. Examples of pharmaceutically acceptable salts include, but 
are not limited to, mineral or organic acid salts of basic residues such as amines; 
alkali or organic salts of acidic residues such as carboxylic acids; and the like. The 
pharmaceutically acceptable salts include the conventional non-toxic salts or the 
quaternary ammonium salts of the parent compound formed, for example, from 
non-toxic inorganic or organic acids. For example, such conventional non-toxic 
salts include those derived from inorganic acids such as hydrochloric, 
hydrobromic, sulfuric, sulfamic, phosphoric, nitric and the like; and the salts 
prepared from organic acids such as acetic, propionic, succinic, glycolic, stearic, 
lactic, malic, tartaric, citric, ascorbic, pamoic, maleic, hydroxymaleic, 
phenylacetic, glutamic, benzoic, salicylic, sulfanilic, 2-acetoxybenzoic, fumaric, 
benzenesulfonic, toluenesulfonic, methanesulfonic, ethane disulfonic, oxalic, 
isethionic, and the like. 

[0058] The pharmaceutically acceptable salts of the present invention can be 
synthesized from the parent compound which contains a basic or acidic moiety 
by conventional chemical methods. Generally, such salts can be prepared by 
reacting the free acid or base forms of these compounds with a stoichiometric 
amount of the appropriate base or acid in water or in an organic solvent, or in a 
mixture of the two; generally, nonaqueous media like ether, ethyl acetate, 
ethanol, isopropanol, or acetonitrile are preferred. Lists of suitable salts are 
found in Remington's Pharmaceutical Sciences, 17th ed., Mack Publishing 
Company, Easton, PA, 1985, p. 1418, the disclosure of which is hereby 
incorporated by reference. 

[0059] "Stable compound" and "stable structure" are meant to indicate a compound 
that is sufficiently robust to survive isolation to a useful degree of purity from a 
reaction mixture, and formulation into an efficacious therapeutic agent. 

[0060] Furthermore, a compound identified by the methods of the present invention as 
having an effect on DPP-IV activity for use as a therapeutic active substance, in 
particular for the treatment of diabetes type I, diabetes type II, IGT, obesity and 
cancer, is provided. 

[0061] A further aspect of the present invention relates to the use of a compound 
identified by the methods of the present invention as having an effect on DPP-IV 
activity for the manufacture of a medicament for the treatment of diabetes type I, 
diabetes type II, IGT, obesity, and cancer. 

[0062] Having now generally described this invention, the same will become better 
understood by reference to the specific examples, which are included herein for 
purpose of illustration only and are not intended to be limiting unless otherwise 
specified, in connection with the following figures. 

Examples 



[0063] Commercially available reagents referred to in the examples were used according 
to the manufacturers' instructions unless otherwise indicated. 

Example 1 

DNA manipulation and sequence analysis 

[0064] Preparation of DNA probes, digestion with restriction endonucleases, DNA 
ligation and transformation of E. coli strains were performed as described 
(Sambrook, J., Fritsch, E.F. & Maniatis, T. (1989). Molecular Cloning: A 
Laboratory Manual. Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 
NY). For DNA sequencing, the ABI PRISM BigDye Terminator Cycle 
Sequencing Ready Reaction Kit and the ABI PRISM 310 Genetic Analyzer were used. 
PCRs were performed in the T3 Thermocycler (Whatman Biometra), using 
Pfu polymerase (Stratagene). 

Production and Purification of recombinant human sDPP-IV in P. pastoris 

[0065] The ectodomain of DPP-IV, residues 31-766 (sDPP-IV), was amplified by PCR 
using a cDNA and the oligonucleotides 
5'-TGCTGGAATTCGGCACAGATGATGCTAC-3' (containing an EcoRI site, GAATTC) 
and 5'-GCATGGTACCTTGAGGTGCTAAG-3' (containing a KpnI site, GGTACC). 
Using the two new restriction sites, the amplified DNA fragment (SEQ ID NO:1) 
was cloned into the pPICZα-A vector (Invitrogen) to create a fusion with the 
α-mating factor signal sequence for the secretion of the protein. The use of the 
EcoRI restriction site (codons GAA TTC) added the amino acids glutamic acid and 
phenylalanine to the N-terminus of sDPP-IV. The sequence was confirmed by 
sequencing. pPICZα-sDPP-IV was linearized with SacI and transformed by 
electroporation into P. pastoris strain GS115, and the phenotype of the colonies 
obtained was checked as recommended by the distributor Invitrogen. 

[0066] Eight transformants with phenotype MutS were screened for the expression of 
DPP-IV. Colonies were grown at 30°C in YPD medium (1% yeast extract, 2% 
peptone, 2% glucose) with zeocin (100 µg/ml) to an OD600 of 8-10. Cells were 
collected by centrifugation and resuspended in YP medium plus 2% methanol. 
The same amount of methanol was added every 24 h. After 48 h the medium of 
each clone was tested for activity (see below). sDPP-IV was then produced in a 
large scale culture using the transformed cell line with the highest activity per 
volume as described (Dale, G.E., D'Arcy, B., Yuvaniyama, C., Wipf, B., Oefner, 
C. & D'Arcy, A. (2000). Purification and crystallization of the extracellular 
domain of human neutral endopeptidase (neprilysin) expressed in Pichia 
pastoris. Acta Crystallogr. D 56, 894-897). 

[0067] Ten liters of the collected sDPP-IV supernatant of the selected transformed P. 
pastoris cell line were filtered and concentrated to 180 ml by crossflow 
ultrafiltration (skannette) using a 30 kDa filtration module (AGT Technology 
Corporation). The concentrate was passed over a Sephacryl 200 XK 50/100 size 
exclusion column (5 x 95 cm, Pharmacia) equilibrated with 50 mM Tris-HCl pH 
7.8 and 100 mM NaCl (S-buffer). Collected fractions were screened by 
SDS-PAGE and for activity. Fractions containing sDPP-IV were dialysed against 50 
mM Tris-HCl pH 7.9. The protein solution was loaded on a Fractogel-TMAE 
column (2.6 x 13 cm, Merck) equilibrated with 50 mM Tris-HCl pH 7.9, washed 
with two column volumes of the same buffer and eluted with 500 ml of a linear 
gradient from 0 to 200 mM NaCl. Fractions containing sDPP-IV were dialysed 
against 20 mM sodium acetate pH 4.8. The protein solution was loaded on a 
Fractogel-COO⁻ column (1 x 12 cm, Merck) equilibrated with the same buffer 
and washed with two column volumes of this buffer. Bound proteins were eluted 
with 200 ml of a linear gradient from 50 to 500 mM NaCl. The elution profile 
showed a major peak at 250 mM NaCl. Preparation of enzymatically 
deglycosylated sDPP-IV (sDPP-IVdeglycos) was carried out prior to loading on the 
last gel filtration column. 0.1% EndoF1-GST was added to the pooled fractions of 
DPP-IV and incubated for 20 h at 21°C. The concentrated protein solution was 
loaded on a Biosec size exclusion column (1.6 x 60 cm, Merck), which was 
equilibrated with S-buffer. Fractions were analyzed by SDS-PAGE, showing a 
purity > 95%. N-terminal sequencing showed that the protein was efficiently 
processed by the STE13 signal peptidase, which cleaves off the α-mating factor. 
Preparation of the sDPP-IVdeglycos:ADA complex was performed by addition of a 
twofold excess of ADA (Sigma Type IV, from calf intestinal mucosa) and 
purification using a Biosec size exclusion column. 

[0068] The soluble extracellular domain of human dipeptidyl peptidase IV (sDPP-IV; 
residues 31-766) was expressed in the yeast Pichia pastoris. The protein was 
secreted at the low level of 1 mg/l as estimated from the total activity. As a first 
purification step the concentrated protein was passed through a size-exclusion 
column, which removed the main fraction of contaminating peptides from the 
yeast-peptone medium. Sequential chromatography on anion and cation 
exchangers and a second size exclusion chromatography were used to obtain protein 
of 95% purity as judged by SDS-PAGE. The yield of pure protein was 0.3 mg/l 
growth medium. The purified protein shows essentially identical kinetic 
parameters and inhibition constants for known inhibitors of DPP-IV to those 
reported for the enzyme purified from human serum (Tables 1 and 2). 
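
The figures quoted above can be combined into a simple recovery estimate. The short calculation below is a sketch only; it assumes that the secretion level and the final yield both refer to the same ten-liter culture, which the text suggests but does not state explicitly.

```python
culture_volume_l = 10.0    # liters of supernatant processed
secreted_mg_per_l = 1.0    # secretion level estimated from total activity
purified_mg_per_l = 0.3    # yield of pure protein per liter of growth medium

secreted_total_mg = secreted_mg_per_l * culture_volume_l
purified_total_mg = purified_mg_per_l * culture_volume_l

# Fraction of secreted protein recovered after the four chromatography steps.
recovery = purified_total_mg / secreted_total_mg
```

Under these assumptions about 3 mg of pure protein is obtained from the ten-liter culture, corresponding to an overall recovery of roughly 30%.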

Analytical methods 

[0069] Purification of sDPP-IV was followed by electrophoresis on 10-20% Tricine SDS 
polyacrylamide gradient gels (Laemmli, U.K. (1970). Cleavage of structural 
proteins during assembly of the head of bacteriophage T4. Nature 227, 
680-685). Protein concentrations were determined according to Bradford (Bradford, 
M.M. (1976). A rapid and sensitive method for the quantitation of microgram 
quantities of protein utilizing the principle of protein-dye binding. Anal. 
Biochem. 72, 248-254) or for pure protein by absorption spectroscopy using the 
calculated molar extinction coefficient at 280 nm of 193,920 M⁻¹ cm⁻¹ 
(A280(0.1%) = 2.27 cm²/mg; Pace, C.N., Vajdos, F., Fee, L., Grimsley, G. & Gray, T. 
(1995). How to measure and predict the molar absorption coefficient of a 
protein. Protein Sci. 4, 2411-2423). Analytical gel filtration chromatography was 
performed on a Superdex 200 12 HR 10/30 column (Pharmacia) equilibrated 
with S-buffer. The eluate was monitored with a miniDAWN multi-angle laser 
light scattering detector (Wyatt) and a refractive index detector (Shodex), which 
allows the determination of the molecular weight and dispersity over the elution 
peak (Wyatt, P.J. (1993). Light scattering and the absolute characterisation of 
macromolecules. Analytica Chimica Acta 272, 1-40). Sedimentation equilibrium 
runs in a Beckman analytical ultracentrifuge (model Optima XL-A) were 
performed at 20°C and 9000 rpm for sDPP-IVdeglycos and at 7000 rpm for the 
sDPP-IVdeglycos:ADA complex. The initial protein concentrations were 0.22 to 0.25 
mg/ml in S-buffer. The absorption was followed at 280 nm. Assumed partial 
specific volumes for sDPP-IV of 0.729 cm³/g and ADA of 0.735 cm³/g were used 
to determine the molecular masses. 
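
The spectroscopic concentration determination follows the Beer-Lambert law. The sketch below shows how the two quoted coefficients relate to each other and how a concentration could be derived from an absorbance reading. The function name, the 1 cm default path length and the example reading of 0.50 are assumptions made for illustration.

```python
EPSILON_280 = 193_920.0    # M^-1 cm^-1, calculated molar extinction coefficient
A280_0_1_PERCENT = 2.27    # absorbance of a 1 mg/ml solution at 1 cm path length

def protein_conc_mg_per_ml(a280, path_cm=1.0):
    """Concentration in mg/ml from absorbance at 280 nm (Beer-Lambert law)."""
    return a280 / (A280_0_1_PERCENT * path_cm)

# The ratio of the two coefficients gives the molar mass they jointly imply.
implied_mass_da = EPSILON_280 / A280_0_1_PERCENT

# Hypothetical absorbance reading, for illustration only.
conc = protein_conc_mg_per_ml(0.50)
```

The implied molar mass is about 85.4 kDa per subunit, and the hypothetical reading of 0.50 corresponds to about 0.22 mg/ml, which is within the concentration range used for the sedimentation runs.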

[0070] Free sulfhydryl groups were determined according to the procedure described by 
Ellman (Ellman, G.L. (1959). Tissue sulfhydryl groups. Arch. Biochem. Biophys. 
82, 70-77) under denaturing conditions (0.3% SDS in 50 mM Tris pH 8.0). 

Thermostability measurements 

[0071] The irreversible loss of activity after incubation at various temperatures was used 
as an operational criterion of the thermostability of sDPP-IV. Kinetics of 
irreversible heat inactivation were performed as described by Sterner et al. 
(Sterner, R., Kleemann, G.R., Szadkowski, H., Lustig, A., Hennig, M. & 
Kirschner, K. (1996). Phosphoribosyl anthranilate isomerase from Thermotoga 
maritima is an extremely stable and active homodimer. Protein Sci. 5, 
2000-2008) with a final protein concentration of 20 µg/ml in 50 mM potassium 
phosphate buffer at pH 7.5, containing 100 mM NaCl. The residual activity was 
determined by recording the initial velocity at 25°C of the enzyme-catalyzed 
reaction (see below) and the averaged values obtained were plotted against the 
incubation temperature. 

Biacore 

[0072] DPP-IV was immobilized on a CM5 surface plasmon resonance sensor 
(Biacore) using standard amide coupling chemistry. The organic adlayer on this 
sensor type consists of carboxymethylated dextran (MW ≈ 100 kDa). After 
activation of the carboxylic acid groups using 
carbodiimide/N-hydroxysuccinimide solutions, the surface was contacted with a DPP-IV solution 
(80 µl) containing ≈ 100 µg/ml protein in acetate buffer (10 mM, pH 4.5). The 
amount immobilized corresponded to a sensor response of roughly 10 000 RU. 
The surfaces of two flow cells were modified with protein. To suppress baseline 
drift - possibly due to slow dimer dissociation - the protein of one cell was 
cross-linked by short contact with carbodiimide/N-hydroxysuccinimide 
solution. This treatment did not influence the protein activity, since binding 
constants determined with cross-linked protein were similar to those 
determined with non-cross-linked protein. Hepes buffer (0.01 M Hepes, pH 7.4, 
0.15 M NaCl, 3 mM EDTA, 0.005% polysorbate 20 (v/v)) was used as the 
running buffer. Diprotin-A was dissolved directly in this buffer. NVP-DPP728 
was first dissolved in pure DMSO and then diluted into running buffer. The final 
inhibitor solution contained less than 0.1% DMSO. Binding experiments were 
carried out by contacting the immobilized protein surfaces with inhibitor 
solutions of varying concentrations at a flow rate of 10 µl/min or 30 µl/min. 
After each contact with inhibitor, the protein surfaces were regenerated by 
extensive washing with running buffer. 
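
To give a feeling for the signal sizes involved, the following sketch estimates the theoretical maximum response for a small inhibitor and evaluates a simple 1:1 steady-state binding model. It is not the evaluation procedure used here, which is not described in detail. The subunit mass of 85 kDa, the molecular mass of Diprotin-A (341.45 Da, from its formula Ile-Pro-Ile), the assumption of one fully active binding site per subunit and the dissociation constant are all illustrative assumptions.

```python
def theoretical_rmax(immobilized_ru, inhibitor_da, protein_da, sites=1):
    """Maximum response (RU) if every immobilized subunit binds `sites` inhibitors."""
    return immobilized_ru * inhibitor_da / protein_da * sites

def steady_state_response(conc, kd, rmax):
    """Equilibrium response for 1:1 binding: Req = Rmax * C / (KD + C)."""
    return rmax * conc / (kd + conc)

# About 10 000 RU of protein immobilized; masses are assumptions, see text above.
rmax = theoretical_rmax(immobilized_ru=10_000, inhibitor_da=341.45, protein_da=85_000)

# Hypothetical KD of 5 µM: at an inhibitor concentration equal to KD the
# response is half of Rmax.
half_response = steady_state_response(conc=5e-6, kd=5e-6, rmax=rmax)
```

Under these assumptions the maximum signal is only about 40 RU, which illustrates why a high immobilization level and a stable baseline matter for small-molecule binding experiments.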

Activity assay 

[0073] The activity assay is based on the increase of fluorescence of products compared 
to the substrate Ala-Pro-7-amido-4-trifluoromethylcoumarin (Calbiochem; 
Smith, R.E., Reynolds, C.J. & Elder, E.A. (1992). The evolution of proteinase 
substrates with special reference to dipeptidylpeptidase IV. Histochem. J. 24, 
637-647). A 20 mM stock solution in 10% DMF is stored at -20°C until use. 
Purification was followed by using a final substrate concentration of 50 µM, and 
for the determination of kinetic parameters it was varied between 1.5 µM and 
500 µM in the assay. DPP-IV activity assays were performed in 96 well plates in a 
total assay volume of 100 µl. The assay buffer consists of S-buffer containing 0.1 
mg/ml BSA. Fluorescence is detected in a Luminescence Spectrometer LS 50B 
(Perkin Elmer) at an excitation wavelength of 400 nm and an emission 
wavelength of 505 nm. Initial rate constants are calculated by best fit linear 
regression. 
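
The determination of an initial rate by linear regression can be sketched as follows. The function name and the fluorescence readings are hypothetical; the sketch simply fits a least-squares line through the early, linear part of a progress curve and reports its slope.

```python
def initial_rate(times_s, fluorescence):
    """Least-squares slope of fluorescence versus time (signal units per second)."""
    n = len(times_s)
    if n != len(fluorescence) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_f = sum(fluorescence) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, fluorescence))
    return sxy / sxx

# Hypothetical readings from the linear phase of one well.
times = [0, 10, 20, 30, 40]
signal = [100.0, 120.5, 139.5, 160.0, 180.0]

rate = initial_rate(times, signal)
```

Only readings taken while the progress curve is still linear should be passed to such a fit; the conversion from fluorescence units to product concentration requires a separate calibration that is not shown.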

Example 2 

Crystallization and Structure determination 

[0074] For crystallization trials, sDPP-IVdeglycos was concentrated to approximately 10 
mg/ml. A reduced factorial screen was carried out using the vapour diffusion 
method. Crystals were obtained with 20-25% PEG 3350, 200 mM MgCl2, Tris 
pH 8.5 and 15% glycerol. The crystals were flash-frozen in liquid nitrogen and 
exhibit the orthorhombic space group P2₁2₁2₁ with cell dimensions of about 65 
Å, 68 Å and 420 Å and one dimer per asymmetric unit. They diffract to a 
maximum of 2.3 Å resolution using synchrotron radiation and show rather high 
mosaicity (0.5-1.2°). Addition of 1 mM Diprotin-A prior to crystallization led to 
crystals of the complex. The mercury derivative was produced by 
cocrystallization with 0.1 mM HgCl2. 
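
The statement that one dimer occupies the asymmetric unit can be checked for plausibility with the Matthews coefficient. The following sketch uses the approximate cell dimensions given above. The subunit mass of 85 kDa is an assumption (roughly the value implied by the extinction coefficients quoted in the analytical methods), and the factor 1.23 is the usual approximation for converting the Matthews coefficient into a solvent fraction.

```python
a, b, c = 65.0, 68.0, 420.0   # approximate cell dimensions in Å
Z = 4                         # asymmetric units per unit cell in P2(1)2(1)2(1)
subunit_da = 85_000.0         # assumed mass of one sDPP-IV subunit
dimer_da = 2 * subunit_da     # one dimer per asymmetric unit

cell_volume = a * b * c       # orthorhombic cell: all angles are 90 degrees

# Matthews coefficient in Å^3 per Dalton.
vm = cell_volume / (Z * dimer_da)

# Approximate solvent fraction of the crystal.
solvent_fraction = 1.0 - 1.23 / vm
```

With these inputs the Matthews coefficient is about 2.7 Å³/Da and the solvent content about 55%, values that are typical for protein crystals and therefore compatible with one dimer per asymmetric unit.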

[0075] Data collection was performed using synchrotron radiation (Swiss Light Source, 
SLS Villigen, Switzerland and ID14, ESRF Grenoble, France) as well as in-house 
facilities (search for heavy atom derivatives, evaluation of crystal quality), and 
data were processed with DENZO (Otwinowski, Z. (1993). Oscillation data reduction 
program. In Proceedings of the CCP4 Study Weekend: Data Collection and 
Processing (Sawyer, L., Isaacs, N. & Bailey, S., eds.), pp. 56-62, SERC Daresbury 
Laboratory, UK). Details of the data collection statistics are given in Table 3. All 
programs used are part of the CCP4 suite (Collaborative Computational 
Project, Number 4 (1994). The CCP4 suite: programs for protein 
crystallography. Acta Crystallogr. D 50, 760-763), except where indicated. The 
structure was determined by multiwavelength anomalous dispersion (MAD) of 
the mercury derivative. One major mercury binding site per subunit (Cys551, 
one of the two free SH-groups Cys301 and Cys551 that are located near the 
active site) was identified by inspection of the difference Patterson maps 
calculated from the peak wavelength data and was subsequently refined using 
SHARP (De la Fortelle, E. & Bricogne, G. (1997). Maximum likelihood 
heavy-atom parameter refinement for multiple isomorphous replacement and 
multiwavelength anomalous diffraction methods. Methods Enzymol. 276, 
472-494). Location of the twofold non-crystallographic axis was performed using this 
mercury site and the program Find2Folds (Dunten, P. & Hennig, M. (2002). 
Locating non-crystallographic symmetry elements: The program Find2Folds. 
Acta Crystallogr. A 58, C76). Further analysis revealed another site per subunit 
(Cys301) with lower occupancy, with the site branched into two positions about 
2.4 Å apart. Subsequently the phases were improved by application of twofold 
averaging combined with solvent flattening and histogram matching as 
implemented in DM. The initial electron density at 2.6 Å resolution was readily 
interpretable and about 90% of the polypeptide chain could be built. The 
molecular model was refined against 2.3 Å data. Subsequent rounds of manual 
rebuilding and refinement with REFMAC (Murshudov, G.N., Vagin, A.A., 
Lebedev, A., Wilson, K.S. & Dodson, E.J. (1999). Efficient anisotropic 
refinement of macromolecular structures using FFT. Acta Crystallogr. D 55, 
247-255) led to a complete molecular structure of the polypeptide chain from 
residues Ser39 to Pro766. Details of the refined structures are reported in Table 
3. Coordinates have been deposited in the Protein Data Bank (PDB). 

Overall structure 

[0076] The structure of human DPP-IV was solved by multiwavelength anomalous dispersion 
(MAD) using a mercury derivative (see Table 3) and subsequently refined to an 
R-factor of 21.5% at 2.1 Å resolution. The current model consists of all residues 
from Ser39 to Pro766 of the amino acid sequence of the expressed ectodomain 
of the protein. 

[0077] A homodimer of DPP-IV is situated in the asymmetric unit (Figure 2). 
Dimerization is also observed in solution under various conditions and is 
required for activity. Each subunit is made of two domains, the catalytic domain 
with an α/β hydrolase fold containing the catalytic triad (Ser630, Asp708, 
His740) and a domain with an eight-bladed β-propeller fold, the β-propeller 
domain (Figure 2). The assignment of the secondary structure is given in Figures 
1 and 2. The only other known crystal structure of this class of enzyme is prolyl 
oligopeptidase (POP) determined by Fülöp (Fülöp, V., Böcskei, Z. & Polgár, L. 
(1998). Prolyl oligopeptidase: an unusual beta-propeller domain regulates 
proteolysis. Cell 94, 161-170; pdb entry 1qfm). POP also has an α/β-hydrolase 
and a β-propeller domain, but is monomeric and the β-propeller consists of 
seven repeats only (Figure 3C). 

Catalytic Domain 

[0078] The catalytic domain is built up of residues Gln508 to Pro766 and contains a 
central eight-stranded parallel β-sheet that is flanked by 12 helices, known as the α/β 
hydrolase fold. The 21% sequence identity to POP indicates significant structural 
homology (Figure 1), and superposition of the central α-helix, carrying the 
catalytic Ser630 on its first turn, with the corresponding structure of POP gives 
an r.m.s. deviation of 2.5 Å for 238 residues. The catalytic domain is connected 
to the β-propeller by an N-terminal 15 residue linker, which is considerably 
shorter than the corresponding 76 residue region in POP. The residues lacking 
in DPP-IV are, however, replaced structurally and functionally by the 
C-terminal part of the catalytic domain of the second subunit of the dimer. 

β-propeller domain 

[0079] The β-propeller domain is formed by the residues Lys56 to Asn497. The 
preceding N-terminal residues Ser39 to Leu55 form a loop structure with a small 
α-helix (α1*, Figure 1) at the surface and in close proximity to the first residues 
of the catalytic domain. The β-propeller domain consists of an eight-fold repeat 
of a four-stranded antiparallel β-sheet motif (blade, Figure 3). The blades are in 
circular arrangement such that they form a solvent filled tunnel with a diameter 
of about 13 Å. 

[0080] The β-propeller domain in DPP-IV does not form a joint β-sheet motif 
(described as molecular "velcro"; Fülöp, V. & Jones, D.T. (1999). Beta 
propellers: structural rigidity and functional diversity. Curr. Opin. Struct. Biol. 9, 
715-721; Paoli, M. (2001). Protein folds propelled by diversity. Prog. Biophys. 
Mol. Biol. 76, 103-130), but rather the blades show a regular arrangement (β1/1 
to β7/4 or β8/4) (Figure 3A) around the central axis forming a ring system that 
is not closed. 

[0081] DPP-IV deviates from the regular β-propeller fold by additional secondary 
structural elements. An anti-parallel β-sheet is inserted in blade two between 
strands one and two. The tip of the turn carries the residue Arg125, which forms a 
salt bridge with Glu205. Glu205 is situated at the C-terminal turn of an α-helix 
(residues Trp154 to Thr199) that is inserted between the first and second 
strands of blade four. Arg125, Glu205 and the neighboring Glu204 form a 
significant part of the substrate binding site and are mainly responsible for the 
substrate specificity. A further anti-parallel β-sheet motif formed by residues 
Asp230 to Asn263 is inserted between strands three and four of blade four 
(Figure 3B). This structural element forms a significant part of the dimer 
interface (see below). 

[0082] Whereas the N-terminal β-sheet structure of the propeller has shorter strands 
and is somewhat tilted, the loop connecting the first and second β-sheet is 
longer, shows high temperature factors and may reduce the rigidity of the 
propeller architecture. The reduced stability of the circular domain structure at 
this position might be compensated by an extended hydrophobic cluster that 
consists of Ile63, Leu69, Ile76, Phe89, Leu90, Phe95, Phe98, Ile107, Ile114, 
Tyr135, Leu137 and Leu142, a salt bridge between Arg61 and Asp104, and a 
hydrogen bond between the main chain NH of Arg61 and Tyr105. This 
distortion leads to a reduced height of the propeller at the positions between 
blades one and two (Figure 3B). 

[0083] As no residues from the α/β hydrolase domain fill this up, a cleft between the 
two domains of the DPP-IV molecule is formed with a diameter of about 15 Å, 
enabling access to the catalytic site (Figure 4). Therefore, we propose that 
DPP-IV has two independent ways for the substrate and product to access and leave 
the active site: a cleft between the domains and the tunnel through the 
β-propeller. The open cleft may enable large peptides and partially folded proteins 
to access the active site. The narrower tunnel could be an exit for the cleaved 
dipeptides (Figure 4). The crystal structure of POP shows that the cleft between 
the two domains does not exist and the tunnel through the β-propeller is 
narrower, with about 4 Å compared to about 13 Å for DPP-IV (Figure 3A and 3C). 
This structural difference is supported by the observation that DPP-IV can 
process much larger substrates than POP. Peptides with a length of up to 
about 80 residues appear to be good substrates of DPP-IV. Larger proteins may 
also be cleaved depending on their tertiary structure. POP is reported to 
hydrolyse substrates with a maximum size of only about 30 residues (Polgar, L. 
(1992). Unusual secondary specificity of prolyl oligopeptidase and the different 
reactivities of its two forms toward charged substrates. Biochemistry 31, 
7729-7735). As the diameter of the β-propeller tunnel in POP is significantly smaller, 
it is conceivable that the structure of DPP-IV represents a more open and active 
enzyme. 

[0084] The β-propeller motif has been found in several further proteins, but no or only 
low sequence homology could be demonstrated (Polgar, L. (1992). Unusual 
secondary specificity of prolyl oligopeptidase and the different reactivities of its 
two forms toward charged substrates. Biochemistry 31, 7729-7735). A search of 
the PDB for homologous structures gave the best results for clathrin (7 blades, 
ter Haar, E., Musacchio, A., Harrison, S.C. & Kirchhausen, T. (1998). Atomic 
structure of clathrin: a beta propeller terminal domain joins an alpha zigzag 
linker. Cell 95, 563-573), methylamine dehydrogenase (7 blades, Chen, L., Doi, 
M., Durley, R.C., Chistoserdov, A.Y., Lidstrom, M.E., Davidson, V.L. & 
Mathews, F.S. (1998). Refined crystal structure of methylamine dehydrogenase 
from Paracoccus denitrificans at 1.75 Å resolution. J. Mol. Biol. 276, 131-149) 
and nitrite reductase (8 blades, Nurizzo, D., Cutruzzola, F., Arese, M., 
Bourgeois, D., Brunori, M., Cambillau, C. & Tegoni, M. (1998). Conformational 
changes occurring upon reduction and NO binding in nitrite reductase from 
Pseudomonas aeruginosa. Biochemistry 37, 13987-13996), but no DPP-IV 
related function can be expected. 

Active site 

[0085] The catalytic triad (Ser630, Asp708, His740) is located in a large cavity at the 
interface of the two domains. Ser630 is found at the tip of a very sharp turn 
between β-strand 5 and helix C, called the nucleophile elbow, which is a 
characteristic of hydrolases of the α/β type (Ollis, D.L., Cheah, E., Cygler, M., 
Dijkstra, B., Frolow, F., Franken, S.M., Harel, M., Remington, S.J., Silman, I., 
Schrag, J. et al. (1992). The alpha/beta hydrolase fold. Protein Eng. 5, 
197-211). The serine hydroxy group is well exposed to solvent and hydrogen bonded 
to the catalytic imidazole group of His740 on one side (2.6 Å) and accessible to 
the substrate on the other side. His740 is found in the middle of a loop between 
β-strand 8 and helix F. With a distance of 2.75 Å to Nδ of the imidazole ring, one 
of the oxygen atoms of Asp708 is hydrogen bonded to His740 and completes the 
catalytic triad (Figure 5). The other oxygen atom of the carboxylate group of 
Asp708 is coordinated by two main chain NH-groups (Val711 and Asn710). 
Thus, the location and geometry of the triad are very similar to those found in 
other α/β hydrolases, with the "handedness" opposite to that of the classical serine 
peptidases. 

[0086] The negatively charged oxyanion of the tetrahedral intermediate is stabilized by 
the main chain NH-group of Tyr631 and by the hydroxy group of Tyr547 
(Figure 5). Furthermore, the structure shows that the two glycines Gly628 and Gly632 are 
important for the formation of the sharp turn that brings the catalytic residue 
Ser630 into the correct position. This is in accordance with mutagenesis studies on 
rat DPP-IV (Ogata, S., Misumi, Y., Tsuji, E., Takami, N., Oda, K. & Ikehara, Y. 
(1992). Identification of the active site residues in dipeptidyl peptidase IV by 
affinity labeling and site-directed mutagenesis. Biochemistry 31, 2582-2587) 
showing that the sequence Gly628-X-Ser630-Tyr631-Gly632 is essential for DPP-IV 
activity. 

Substrate binding 

[0087] The substrate binding site of DPP-IV is indicated by the inhibitor Diprotin-A 
(Ile-Pro-Ile). It is a slowly hydrolysable substrate with kcat/KM a factor of 10 less 
than Ile-Pro-4-nitroanilides (Rahfeld, J., Schierhorn, M., Hartrodt, B., Neubert, 
K. & Heins, J. (1991). Are diprotin A (Ile-Pro-Ile) and diprotin B (Val-Pro-Leu) 
inhibitors or substrates of dipeptidyl peptidase IV? Biochim. Biophys. Acta 1076, 
314-316). Inspection of the electron density map shows the ligand covalently 
bound to the active site Ser630 of the enzyme in both subunits. The N-terminal 
Ile (P2) and Pro (P1) residues are well defined and enable a detailed analysis of 
the interaction with the substrate binding site (according to the notation of 
Schechter; Schechter, I. & Berger, A. (1968). On the active site of proteases. 3. 
Mapping the active site of papain; specific peptide inhibitors of papain. Biochem. 
Biophys. Res. Commun. 32, 898-902). Less well defined electron density is found 
for the C-terminal Ile (P1'), but in subunit B the conformation of this part of the 
ligand could also be observed (Figure 5). The side chain Nε of the catalytic 
His740 is in hydrogen bonding distance to the NH-group of P1' (2.90 Å) and to 
the Oγ of the Ser630 side chain (2.74 Å). 

[0088] DPP-IV hydrolyzes oligopeptides and proteins from the N-terminus, cleaving 
dipeptide units when the second residue is proline, hydroxyproline, 
dehydroproline, pipecolic acid or alanine. In both subunits the proline in 
position P1 of Diprotin-A is in the trans-configuration and fits optimally into 
the pocket of the active site as expected (Fischer, G., Heins, J. & Barth, A. (1983). 
The conformation around the peptide bond between the P1- and P2-positions is 
important for catalytic activity of some proline-specific proteases. Biochim. 
Biophys. Acta 742, 452-462). The S1 pocket is formed by Val711, Val656, 
Tyr662, Tyr666, Tyr659 and Tyr631, which shape a well defined hydrophobic 
pocket that would be filled by proline much better than by alanine. Gly is also 
accepted, but with very low kcat/KM values (Brandt, W., Lehmann, T., Thondorf, 
I., Born, I., Schutkowski, M., Rahfeld, J.U., Neubert, K. & Barth, A. (1995). A 
model of the active site of dipeptidyl peptidase IV predicted by comparative 
molecular field analysis and molecular modelling simulations. Int. J. Pept. 
Protein Res. 46, 494-507). All other naturally occurring amino acid residues 
cannot occupy position P1: their side chains are either too bulky or too hydrophilic. 
The side chains of the residues P2 and P1' point into the solvent and no 
interaction with the protein occurs. This explains the large diversity of amino 
acids accepted in substrates at these positions. 

[0089] Essential for substrate binding and catalysis is the N-terminus of the substrates, 
which has to be unprotected and protonated (Brandt, W., Ludwig, O., 
Thondorf, I. & Barth, A. (1996). A new mechanism in serine proteases catalysis 
exhibited by dipeptidyl peptidase IV (DP IV) - Results of PM3 semiempirical 
thermodynamic studies supported by experimental results. Eur. J. Biochem. 236, 
109-114). The Diprotin-A complex shows that the terminal -NH3+ group is 
held very precisely in position by strong interactions with the carboxylates of 
Glu205 and Glu206 (Figure 5). A third glutamate, Glu204, stabilizes this 
substrate recognition site by a hydrogen bonding network with the backbone 
NH of Arg125, His126 and Ser127 as well as the hydroxy group of Ser127. 
The importance of the glutamate residues is confirmed by single point mutations 
that abolish DPP-IV activity (Abbott, C.A., McCaughan, G.W. & Gorrell, M.D. 
(1999). Two highly conserved glutamic acid residues in the predicted beta 
propeller domain of dipeptidyl peptidase IV are required for its enzyme activity. 
FEBS Lett. 458, 278-284). The double Glu-motif is located at the end of a 
helical segment (α2* in Figure 1, see also Figure 3) that is highly conserved in the 
DPP IV-like gene family (Asp-Trp-X-Tyr-Glu-Glu-Glu-X). The helix represents 
a deviation from the regular β-sheet architecture of the β-propeller domain 
(Figures 1 and 3A). The superposition of the active sites of the exopeptidase 
DPP-IV complexed with Diprotin A and the endopeptidase POP complexed 
with an octapeptide (Fulop, V., Szeltner, Z., Renner, V. & Polgar, L. (2001). 
Structures of prolyl oligopeptidase substrate/inhibitor complexes. Use of 
inhibitor binding for titration of the catalytic histidine residue. J. Biol. Chem. 
276, 1262-1266) shows clear differences. The octapeptide substrate of POP 
coincides with the double Glu-motif in DPP-IV, indicating that this additional 
structural element is very important for substrate selection. Thus, the 
double Glu-motif is a recognition site for the N-terminus of substrates and 
restricts the cleavage to dipeptides, and the S1 pocket provides optimal 
binding to proline and alanine residues, leading to a highly specific peptidase. 

Mode of inhibition by Diprotin-A 

[0090] Inspection of the electron density of the bound inhibitor shows a covalent 
linkage to Ser630 and an sp3 configuration for the C-atom of the former 
carbonyl group of the scissile peptide. Consequently, a tetrahedral intermediate 
is observed in the complex structure with the substrate Diprotin A (Figure 5) 
with the oxyanion stabilized by hydrogen bonds to the hydroxy group of the side 
chain of Tyr547 (2.80 Å) and the main chain amine of Tyr631 (3.38 Å). As much 
of the catalytic power of serine proteases derives from their preferential binding of 
this transition state, the tetrahedral intermediate is a well-defined but high energy 
state with a short lifetime, and its accumulation must be the result of a kinetic 
barrier. 



[0091] Inspection of the active site structure reveals several structural features that are 
special to Diprotin A and may lead to the competitive inhibition by this 
substrate. First, the two hydrophobic isoleucine side chains point into the same 
direction in proximity and, therefore, this hydrophobic interaction may stabilize 
the tripeptide in an unsuitable conformation for the progress of the reaction. 
Second, a large network of salt bridges and hydrogen bonds stabilizes the 
complex. It involves the carboxyl groups of Glu205/206 that interact with the 
N-terminus of the tripeptide, but Glu205 makes another salt bridge to Arg125 and 
this in turn interacts with the C-terminal carboxyl group of the tripeptide 
(Figure 5). It is obvious that this interaction is only present in tripeptidic 
substrates and may stabilize the observed intermediate by protection of the 
leaving group. 

Dimerization 

[0092] The crystal structure as well as analytical ultracentrifugation indicate dimeric 
oligomerization for deglycosylated sDPP-IV with a molecular weight of 169 kDa 
and non-crystallographic twofold symmetry (Figure 2). Six percent or 1837 Å² of 
the total solvent accessible surface area of each subunit is buried in the dimer 
interface (program XSAE, Broger, C. personal communication). This interface is 
mainly built up by two extra β-strands (β1* and β2*) in the loop between 
strands two and three of the fourth blade of the β-propeller domain (Figure 3A 
and 3B). Further interaction is provided by the α/β hydrolase domain with helix 
αE, β-strand β8 and helix αF with mainly hydrophobic interactions. The active 
site is very close to this dimer interface (Figure 2) with His740 from the catalytic 
triad located in the loop connecting αF and β7 (Figure 1). Consequently, 
disruption of the dimer interface would also strongly affect the catalytic activity, 
and dimerization is required for activity. 

Stability of DPP-IV 

[0093] As a cell surface protein, DPP-IV is extremely stable. Recombinant sDPP-IV 
shows a half-life of 5 min at 71°C in irreversible heat inactivation experiments, 
independent of the protein concentration and the degree of glycosylation, 
indicating high thermal stability. In unfolding experiments with protein purified 
from human seminal plasma (Lambeir, A.M., Diaz Pereira, J.F., Chacon, P., 
Vermeulen, G., Heremans, K., Devreese, B., Van Beeumen, J., De Meester, I. & 
Scharpe, S. (1997). A prediction of DPP IV/CD26 domain structure from a 
physico-chemical investigation of dipeptidyl peptidase IV (CD26) from human 
seminal plasma. Biochim. Biophys. Acta 1340, 215-2), DPP-IV retained its native 
conformation up to 8 M urea. 

[0094] The crystal structure points to several factors that may contribute to this 
stability. Firstly, the structural organization as a dimer with an extended 
hydrophobic interface stabilizes the molecule, as shown for several other proteins 
(Thoma, R., Hennig, M., Sterner, R. & Kirschner, K. (2000). Structure and 
function of mutationally generated monomers of dimeric 
phosphoribosylanthranilate isomerase from Thermotoga maritima. Structure 
Fold. Des. 8, 265-276). Secondly, we observe five disulfide bonds and two free 
sulfhydryl groups by SH titration experiments under denaturing conditions, which 
are now confirmed by the X-ray structure. All disulfide bridges in the 
β-propeller connect different strands in blades or stabilize loops (Cys444/Cys447; 
Cys385/Cys394, Cys454/Cys472, Cys328/Cys339). One disulfide bond is 
observed in the α/β-hydrolase domain (Cys649/Cys762) and covalently links the 
C-terminal helix αF to the core of the α/β hydrolase domain. 

Glycosylation 

[0095] sDPP-IV overexpressed in P. pastoris shows a decreasing molecular weight over 
the elution peak in analytical gel filtration as analyzed online with a 
multiangle laser light scattering detector. In contrast, sDPP-IV deglycosylated 
with EndoF glycosidase shows a uniform molecular weight over the whole peak 
range, because of the specific cleavage of asparagine-linked oligomannose after 
the first N-acetylglucosamine residue (GlcNAc). This leads to a decrease in 
molecular weight of 20 kDa as estimated by SDS-PAGE. Crystals suitable for 
X-ray diffraction are only observed for deglycosylated sDPP-IV, and structure 
analysis shows four GlcNAc with interpretable electron density at the positions 
N85, N150, N229 and N281 in subunit A. In subunit B, again N85, N150 and 
N229 are visible, but no electron density was found for N281 and an additional 
site could be identified at N92. The GlcNAc of N85 is involved in a crystal 
contact in both subunits. 
[0096] DPP-IV expressed in humans has a more complex type of glycosylation 
compared to P. pastoris (Cremata, J., Montensino, R., Quintero, O. & Garcia, R. 
(1998). Glycosylation Profiling of Heterologous Proteins. In Pichia Protocols 
(Higgins, D.R. & Cregg, J.M., eds.), vol. 103. pp. 95-106, Humana Press: 
Totowa, New Jersey) and contains terminal sialic acid; however, as shown here, 
this does not seem to be a requirement for correct folding. 



Interaction with ADA 

[0097] Adenosine deaminase (ADA; EC 3.5.4.4) is a 41 kDa protein expressed in all 
mammalian tissues that catalyzes the deamination of adenosine and 
2'-deoxyadenosine to inosine and 2'-deoxyinosine, respectively. It is important for 
the regulation of the extracellular concentration of adenosine and for the 
regulation of the immune response. ADA is involved in T cell activation in 
general and in the pathogenesis of autoimmune disorders (such as rheumatoid 
arthritis) as well as in the mechanism of immunodeficiency diseases (such as SCID 
or AIDS). Binding of the soluble extracellular ADA is a unique property of 
DPP-IV molecules of higher mammals and is not observed for mouse or rat DPP-IV 
(Iwaki-Egawa, S., Watanabe, Y. & Fujimoto, Y. (1997). CD26/dipeptidyl 
peptidase IV does not work as an adenosine deaminase-binding protein in rat 
cells. Cell. Immunol. 178, 180-186). Using analytical ultracentrifugation, we 
observe a 1:1 complex of one ADA molecule per sDPP-IV subunit, giving a 
molecular weight of 252 kDa. Surface plasmon resonance (Biacore) 
measurements show a binding constant of 3.15 ± 2 nM to bovine ADA 
with a very low dissociation rate (koff = 8.75 × 10⁻⁵ s⁻¹, kon = 2.98 × 10⁴ M⁻¹ s⁻¹), 
indicating a strong interaction. 
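
The relation between the quoted rate constants and the binding constant can be checked with a few lines of Python. This is only an illustrative sketch: it assumes a simple 1:1 binding model in which KD = koff / kon, and it reads the rate constants as koff = 8.75 × 10⁻⁵ s⁻¹ and kon = 2.98 × 10⁴ M⁻¹ s⁻¹. The function and variable names are hypothetical and not part of the original analysis.

```python
def dissociation_constant(k_off: float, k_on: float) -> float:
    """Return KD in mol/L for a 1:1 binding model (assumption): KD = k_off / k_on.

    k_off is given in 1/s and k_on in 1/(M*s).
    """
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


# Rate constants as read from the text (sDPP-IV binding to bovine ADA).
K_OFF = 8.75e-5  # 1/s
K_ON = 2.98e4    # 1/(M*s)

kd_molar = dissociation_constant(K_OFF, K_ON)
kd_nanomolar = kd_molar * 1e9

# The ratio of the rate constants gives roughly 2.9 nM, which lies inside
# the reported interval of 3.15 +/- 2 nM.
print(f"KD from rate constants: {kd_nanomolar:.2f} nM")
```

Under this assumption the kinetic and equilibrium values agree within the stated error, which supports the reading of a slowly dissociating, high-affinity complex.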

[0098] Mutagenesis studies (Abbott, C.A., McCaughan, G.W., Levy, M.T., Church, 
W.B. & Gorrell, M.D. (1999). Binding to human dipeptidyl peptidase IV by 
adenosine deaminase and antibodies that inhibit ligand binding involves 
overlapping, discontinuous sites on a predicted beta propeller domain. Eur. J. 
Biochem. 266, 798-810; Dong, R.P., Tachibana, K., Hegen, M., Munakata, Y., 
Cho, D., Schlossman, S.F. & Morimoto, C. (1997). Determination of adenosine 
deaminase binding domain on CD26 and its immunoregulatory effect on T cell 
activation. J. Immunol. 159, 6070-6076) identified two important regions in 
DPP-IV, Leu340-Val341-Ala342-Arg343 (at the beginning of β5/4) and Leu294 (α4, 
at the end of blade 4), and a less important region, Glu332-Ser333-Ser334-Gly335- 
Arg336 (loop region, at the end of β5/3), that are all located at the surface of the 
β-propeller domain (Figure 1). Mutation to amino acids found in rat DPP-IV 
reduces binding affinity to ADA. These residues form a binding site that is 
located far away from the active site (Figure 2), confirming the independence of 
DPP-IV activity from ADA binding (Table 1; De Meester, I., Vanham, G., Kestens, 
L., Vanhoof, G., Bosmans, E., Gigase, P. & Scharpe, S. (1994). Binding of 
adenosine deaminase to the lymphocyte surface via CD26. Eur. J. Immunol. 24, 
566-570). It is concluded that the function of DPP-IV is the localization and 
orientation of ADA for proper catalysis. The structure gives an indication for the 
orientation and localization at the cell surface, because the N-terminus must be 
close to the membrane and the ADA binding site would be on the opposite side of 
the molecule, pointing away from the cell surface (Figure 2). Further, there 
would be sufficient space to enable interaction of ADA with the A1-adenosine 
receptor (Ciruela, F., Saura, C., Canela, E.I., Mallol, J., Lluis, C. & Franco, R. 
(1996). Adenosine deaminase affects ligand-induced signaling by interacting 
with cell surface adenosine receptors. FEBS Lett. 380, 219-223), which probably 
plays an important role in the ontogenesis of immune tissues. This view would 
also support the hypothesis proposing a link for cell-cell interaction via the 
binding of DPP-IV, ADA and the A1-adenosine receptor. 

Biological Implications 

[0099] The crystal structure of DPP-IV at 2.1 Å resolution reveals a V-shaped dimeric 
molecule with an extended dimer interface fostering the conformation of the 
overall molecule. The membrane association and stability of DPP-IV are used for 
binding of other proteins like ADA in order to achieve localization without 
disturbance of the enzymatic functionality. 

[0100] Analysis of the complex with Diprotin A shows key structural features for 
proline specific exopeptidase specificity and activity. The negative charge of the 
double Glu motif guides the N-terminus of the peptide to the active site and 
fixes the substrate in the correct position for cleavage. The distance between this 
motif and the catalytic Ser630 limits the cleavage to dipeptides, and the S1 pocket 
can only accommodate proline or, with less affinity, alanine as side chains. 

[0101] The low turnover rate of Diprotin A may be explained by the hydrophobic 
interaction of the two Ile residues in the P2 and P1' positions as well as an 
extensive salt bridge cluster that involves the negatively charged C-terminus of 
Diprotin A. This structural information will aid the design of new specific 
inhibitors. 

[0102] The active site is very accessible to the solvent through two entrances, which 
explains why peptides can be cleaved by DPP-IV with almost no size limitation. A 
second access to the active site through the tunnel of the β-propeller domain is 
large enough to enable the release of the cleaved dipeptides. This structural 
arrangement certainly improves the catalytic turnover and is in great contrast to 
the crystal structure of POP, which shows a much narrower tunnel and no further 
access to the active site. 

[0103] For most of the special features of DPP-IV, namely dimerization, regulation of 
substrate access via two entrances, recognition of the substrate (double 
Glu-motif) and interaction with other proteins like ADA, the β-propeller domain 
plays a key role. Thus, DPP-IV is an excellent example that the β-propeller fold 
can be tailored to adapt to different functionality. 



Table 1. Enzyme Kinetic Constants of DPP-IV 

proteins                   kcat (s⁻¹)    KM* (μM)    kcat/KM (μM⁻¹ s⁻¹) 
sDPP-IV deglycos.          43.1          17.2        2.51 
sDPP-IV glycos.            37.3          15.5        2.41 
sDPP-IV deglycos./ADA      39.6          14.8        2.68 

* analyzed using Lineweaver-Burk plots; buffer: 50 mM Tris/HCl pH 7.8, containing 100 mM NaCl, 0.1 
mg/ml BSA and 0.5% dimethylformamide; temperature: 25°C 
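
The last column of Table 1 is the catalytic efficiency, i.e. the quotient of the first two columns. The short Python sketch below recomputes it from the tabulated kcat and KM values; the dictionary keys and function name are illustrative labels chosen here, not identifiers from the original work.

```python
def catalytic_efficiency(kcat: float, km: float) -> float:
    """Return kcat/KM (1/(uM*s)) from kcat (1/s) and KM (uM)."""
    if km <= 0:
        raise ValueError("KM must be positive")
    return kcat / km


# (kcat in 1/s, KM in uM) as listed in Table 1.
table1 = {
    "sDPP-IV deglycosylated": (43.1, 17.2),
    "sDPP-IV glycosylated": (37.3, 15.5),
    "sDPP-IV deglycosylated / ADA": (39.6, 14.8),
}

efficiencies = {
    name: round(catalytic_efficiency(kcat, km), 2)
    for name, (kcat, km) in table1.items()
}

for name, value in efficiencies.items():
    print(f"{name}: kcat/KM = {value:.2f} per uM per s")
```

The recomputed ratios reproduce the tabulated 2.51, 2.41 and 2.68, so glycosylation state and bound ADA change the catalytic efficiency by less than about 10%.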
Table 2. Ki and KD Values of DPP-IV Inhibitors 

                       Ki (μM)    KD (μM)    kon (M⁻¹ s⁻¹)    koff (s⁻¹) 
Ile-Pro-Ile            4.63*      3.8†
NVP-DPP728             0.006*     0.002†     1.36 × 10⁶†      2.48 × 10⁻³† 
NVP-DPP728 (Lit.)‡     0.011      0.010      1.3 × 10⁵        1.3 × 10⁻³ 

† measured with Biacore; buffer: 0.01 M Hepes, pH 7.4, containing 0.15 M NaCl, 3 mM 
EDTA, 0.005% polysorbate 20 (v/v) 

* temperature: 25°C; in assay buffer (see Table 1); glycosylated sDPP-IV 

‡ Hughes, T.E., Mone, M.D., Russell, M.E., Weldon, S.C. & Villhauer, E.B. (1999). 
NVP-DPP728 (1-[[[2-[(5-cyanopyridin-2-yl)amino]ethyl]amino]acetyl]-2-cyano-(S)-pyrrolidine), 
a slow-binding inhibitor of dipeptidyl peptidase IV. Biochemistry 38, 11597-11603 



Table 3. Crystallographic Data and Refinement Statistics 

Data set                              MAD Remote          MAD Peak            MAD Inflection      Apo                 Diprotin-A complex 
Wavelength (Å)                        0.992               1.0065              1.009               0.9765              0.92 
X-ray source                          SLS                 SLS                 SLS                 ID14, ESRF          SLS 
Detector                              MAR IP a            MAR IP a            MAR IP a            Quantum CCD         MAR CCD 
Exposure time/frame (s)               10                  10                  10                  2                   4 
Angular increment per frame (°)       2.0                 2.0                 2.0                 0.25                0.25 
Total rotation range (°)              110                 136                 140                 130                 130 
Crystal to detector distance (mm)     410                 410                 410                 240                 260 
Unit cell parameters a, b, c (Å)      65.2; 68.7; 420.1   65.2; 68.7; 420.1   65.2; 68.7; 420.1   65.5; 68.2; 419.3   65.1; 67.1; 419.6 

Data reduction 
Maximum resolution (Å)                2.6                 2.6                 2.6                 2.1                 2.5 
No. of measurements                   212 619             263 910             276 921             234 528             171 090 
No. of unique reflections             58 627              59 544              59 939              87 113              64 208 
Completeness (%)*                     97.5 (99.4)         99.9 (100.0)        99.9 (99.9)         82.9 (72.3)         97.5 (99.4) 
Rsym (%)*, b                          9.1 (15.9)          9.0 (18.1)          8.6 (14.2)          8.4 (26.8)          9.1 (15.9) 

Heavy-atom refinement parameters 
f' (e) / f'' (e)                      -7.0 / 9.5          -8.0 / 9.8          -12.1 / 5.0 
Phasing power c (anomalous)           0.95                1.0                 0.7 

Refinement statistics                              Apo              Diprotin-A complex 
Resolution range (Å)                               20-2.1           30-2.5 
Rcryst d (Rfree) d (%)                             21.5 (26.5)      22.5 (28.2) 
No. of protein atoms e (mean B in Å²)              11 962 (34.6)    11 962 (27.1) 
No. of water molecules (mean B in Å²)              322 (33.4)       268 (25.0) 
No. of ligand/heavy atoms (mean B in Å²)           6 (77.3)         24 (28.3) 
No. of NAG atoms (mean B in Å²)                    112 (59.0)       98 (51.4) 
rmsd f bonds (Å)                                   0.018            0.019 
rmsd f angles (°)                                  1.86             2.07 

a Marresearch image plate detector, diameter 345 mm, 100 μm pixel size 
* Values in parentheses are statistics for the highest resolution bin. 

b $R_{sym} = \sum_h \sum_i |I_i(h) - \langle I(h) \rangle| / \sum_h \sum_i I_i(h)$, where $I_i(h)$ and $\langle I(h) \rangle$ are the ith and mean measurement of the intensity of 
reflection h. 

c Phasing power $= \sum_h F_H(h) / \sum_h |F_D(h) - |F_N(h) + F_H(h)||$. 

d $R = \sum_h ||F_{obs}| - |F_{calc}|| / \sum_h |F_{obs}|$, where $|F_{obs}|$ and $|F_{calc}|$ are the observed and calculated structure factor amplitudes for the 
reflection h, applied to the working (Rcryst) and test (Rfree) sets, respectively. 

e Non-hydrogen atoms only. 

f rmsd: root mean square deviation from mean. 
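
The two agreement statistics defined in the footnotes of Table 3 can be written out as a short Python sketch. It follows the footnote definitions literally (Rsym over repeated intensity measurements, R over observed and calculated amplitudes). The function names and the toy input values are hypothetical and serve only to illustrate the arithmetic; they are not data from the DPP-IV crystals.

```python
def r_sym(observations):
    """Rsym = sum_h sum_i |I_i(h) - <I(h)>| / sum_h sum_i I_i(h).

    observations maps a reflection index (h, k, l) to the list of its
    repeated intensity measurements.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0:
        raise ValueError("sum of intensities is zero")
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum_h ||Fobs| - |Fcalc|| / sum_h |Fobs| for paired amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Toy data (hypothetical): two reflections measured twice each.
toy_intensities = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
toy_r_sym = r_sym(toy_intensities)

# Toy amplitudes (hypothetical) for the R-factor.
toy_r = r_factor([10.0, 20.0], [9.0, 22.0])
```

Applying `r_factor` to the working set gives Rcryst and to the reflections held out of refinement gives Rfree; the table reports both as percentages.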
Table 4: Structure coordinates for human DPP-IV 

[0104] Table 4 lists the atomic structure coordinates for DPP-IV as derived by X-ray 
diffraction from a crystal of DPP-IV. 
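
The listing in Table 4 uses PDB-style text records (HEADER, REMARK, SEQRES and so on). As a minimal sketch of how the SEQRES records can be turned into one-letter sequences per chain, the following Python function may help. It assumes whitespace-separated records of the form `SEQRES <serial> <chain> <length> <residues...>`; the function name and the two example records (the first two chain B records as they read when written on one line each) are illustrative assumptions, not part of the original disclosure.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def seqres_to_sequences(lines):
    """Collect SEQRES records into one one-letter sequence per chain.

    Unknown residue names are written as 'X'. Lines that are not
    SEQRES records are ignored.
    """
    chains = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 5 or fields[0] != "SEQRES":
            continue
        chain_id = fields[2]
        residues = fields[4:]  # fields[1] is the serial, fields[3] the length
        chains.setdefault(chain_id, []).extend(
            THREE_TO_ONE.get(name, "X") for name in residues
        )
    return {chain: "".join(letters) for chain, letters in chains.items()}


# Example records (assumed layout, 13 residues per record).
example = [
    "SEQRES   1 B  728  SER ARG LYS THR TYR THR LEU THR ASP TYR LEU LYS ASN",
    "SEQRES   2 B  728  THR TYR ARG LEU LYS LEU TYR SER LEU ARG TRP ILE SER",
]
sequences = seqres_to_sequences(example)
print(sequences["B"])
```

Splitting on whitespace rather than on fixed columns is a deliberate simplification; strict PDB parsers read fixed column positions instead.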



HEADER    DPP-IV 
COMPND    Human Dipeptidyl peptidase IV 
SOURCE    human 
REMARK   1 REFINEMENT REMARKS: 
REMARK   1 "apo"-structure 
REMARK   2 (mercury derivative different from MAD experiment used for 
REMARK   2 refinement) 
REMARK   2 2.1A resolution 
REMARK   3 
REMARK   3 REFINEMENT. 
REMARK   3   PROGRAM     : REFMAC 5.0 
REMARK   3   AUTHORS     : MURSHUDOV, VAGIN, DODSON 
REMARK   3 
REMARK   3   REFINEMENT TARGET : MAXIMUM LIKELIHOOD 
REMARK   3 
REMARK   3  DATA USED IN REFINEMENT. 
REMARK   3   RESOLUTION RANGE HIGH (ANGSTROMS) :   2.10 
REMARK   3   RESOLUTION RANGE LOW  (ANGSTROMS) :  12.00 
REMARK   3   DATA CUTOFF            (SIGMA(F)) : NONE 
REMARK   3   COMPLETENESS FOR RANGE        (%) :  82.99 
REMARK   3   NUMBER OF REFLECTIONS             :  87113 
REMARK   3 
REMARK   3  FIT TO DATA USED IN REFINEMENT. 
REMARK   3   CROSS-VALIDATION METHOD          : THROUGHOUT 
REMARK   3   FREE R VALUE TEST SET SELECTION  : RANDOM 
REMARK   3   R VALUE     (WORKING + TEST SET) : 0.21747 
REMARK   3   R VALUE            (WORKING SET) : 0.21485 
REMARK   3   FREE R VALUE                     : 0.26560 
REMARK   3   FREE R VALUE TEST SET SIZE   (%) : 5.0 
REMARK   3   FREE R VALUE TEST SET COUNT      : 4619 
REMARK   3 
REMARK   3  FIT IN THE HIGHEST RESOLUTION BIN. 
REMARK   3   TOTAL NUMBER OF BINS USED           : 20 
REMARK   3   BIN RESOLUTION RANGE HIGH           : 2.100 
REMARK   3   BIN RESOLUTION RANGE LOW            : 2.153 
REMARK   3   REFLECTION IN BIN     (WORKING SET) : 2014 
REMARK   3   BIN R VALUE           (WORKING SET) : 0.246 
REMARK   3   BIN FREE R VALUE SET COUNT          : 91 
REMARK   3   BIN FREE R VALUE                    : 0.278 
REMARK   3 
REMARK   3  NUMBER OF NON-HYDROGEN ATOMS USED IN REFINEMENT. 
REMARK   3   ALL ATOMS                : 12366 
REMARK   3 
REMARK   3  ESTIMATED OVERALL COORDINATE ERROR. 
REMARK   3   ESU BASED ON R VALUE                            (A): 0.280 
REMARK   3   ESU BASED ON FREE R VALUE                       (A): 0.228 
REMARK   3   ESU BASED ON MAXIMUM LIKELIHOOD                 (A): 0.244 
REMARK   3   ESU FOR B VALUES BASED ON MAXIMUM LIKELIHOOD (A**2): 9.427 
REMARK   3 
REMARK   3  RMS DEVIATIONS FROM IDEAL VALUES           COUNT    RMS     WEIGHT 
REMARK   3   BOND LENGTHS REFINED ATOMS        (A):    12400 ;  0.018 ;  0.021 
REMARK   3   BOND LENGTHS OTHERS               (A):    10588 ;  0.001 ;  0.020 
REMARK   3   BOND ANGLES REFINED ATOMS   (DEGREES):    16876 ;  1.867 ;  1.936 
REMARK   3   BOND ANGLES OTHERS          (DEGREES):    24632 ;  0.889 ;  3.000 
REMARK   3   TORSION ANGLES, PERIOD 1    (DEGREES):     1454 ;  5.183 ;  3.000 
REMARK   3   TORSION ANGLES, PERIOD 3    (DEGREES):     2075 ; 19.350 ; 15.000 
REMARK   3   CHIRAL-CENTER RESTRAINTS       (A**3):     1790 ;  0.135 ;  0.200 
REMARK   3   GENERAL PLANES REFINED ATOMS      (A):    13738 ;  0.007 ;  0.020 
REMARK   3   GENERAL PLANES OTHERS             (A):     2674 ;  0.004 ;  0.020 
REMARK   3   NON-BONDED CONTACTS REFINED ATOMS (A):     2592 ;  0.240 ;  0.300 
REMARK   3   NON-BONDED CONTACTS OTHERS        (A):    10721 ;  0.223 ;  0.300 
REMARK   3   NON-BONDED TORSION OTHERS         (A):       17 ;  0.494 ;  0.500 
REMARK   3   H-BOND (X...Y) REFINED ATOMS      (A):      820 ;  0.155 ;  0.500 
REMARK   3   H-BOND (X...Y) OTHERS             (A):        7 ;  0.115 ;  0.500 
REMARK   3   SYMMETRY VDW REFINED ATOMS        (A):        9 ;  0.235 ;  0.300 
REMARK   3   SYMMETRY VDW OTHERS               (A):       38 ;  0.277 ;  0.300 
REMARK   3   SYMMETRY H-BOND REFINED ATOMS     (A):        3 ;  0.397 ;  0.500 
REMARK   3 
REMARK   3  ISOTROPIC THERMAL FACTOR RESTRAINTS.       COUNT    RMS     WEIGHT 
REMARK   3   MAIN-CHAIN BOND REFINED ATOMS  (A**2):     7252 ;  0.874 ;  1.500 
REMARK   3   MAIN-CHAIN ANGLE REFINED ATOMS (A**2):    11766 ;  1.603 ;  2.000 
REMARK   3   SIDE-CHAIN BOND REFINED ATOMS  (A**2):     5148 ;  2.300 ;  3.000 
REMARK   3   SIDE-CHAIN ANGLE REFINED ATOMS (A**2):     5110 ;  3.638 ;  4.500 
REMARK   3 
REMARK   3  NCS RESTRAINTS STATISTICS 
REMARK   3   NUMBER OF NCS GROUPS : NULL 
REMARK   3 
REMARK   4 data collected at 100K at ID14 in Grenoble (ESRF, France) 
REMARK   4 Phasing by MAD using Hg derivative and data collected to 2.7 
REMARK   4 at Villigen (SLS, Switzerland) 
REMARK   4 
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6 


A 
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PRO 


ASP 


GLY 


GLN 


PHE 


ILE 


LEU 


LEU 


SEQRES   7 A  728  GLU TYR ASN TYR VAL LYS GLN TRP ARG HIS SER TYR THR 
SEQRES   8 A  728  ALA SER TYR ASP ILE TYR ASP LEU ASN LYS ARG GLN LEU 
SEQRES   9 A  728  ILE THR GLU GLU ARG ILE PRO ASN ASN THR GLN TRP VAL 
SEQRES  10 A  728  THR TRP SER PRO VAL GLY HIS LYS LEU ALA TYR VAL TRP 
SEQRES  11 A  728  ASN ASN ASP ILE TYR VAL LYS ILE GLU PRO ASN LEU PRO 
SEQRES  12 A  728  SER TYR ARG ILE THR TRP THR GLY LYS GLU ASP ILE ILE 
SEQRES  13 A  728  TYR ASN GLY ILE THR ASP TRP VAL TYR GLU GLU GLU VAL 
SEQRES  14 A  728  PHE SER ALA TYR SER ALA LEU TRP TRP SER PRO ASN GLY 
SEQRES  15 A  728  THR PHE LEU ALA TYR ALA GLN PHE ASN ASP THR GLU VAL 
SEQRES  16 A  728  PRO LEU ILE GLU TYR SER PHE TYR SER ASP GLU SER LEU 
SEQRES  17 A  728  GLN TYR PRO LYS THR VAL ARG VAL PRO TYR PRO LYS ALA 
SEQRES  18 A  728  GLY ALA VAL ASN PRO THR VAL LYS PHE PHE VAL VAL ASN 
SEQRES  19 A  728  THR ASP SER LEU SER SER VAL THR ASN ALA THR SER ILE 
SEQRES  20 A  728  GLN ILE THR ALA PRO ALA SER MET LEU ILE GLY ASP HIS 
SEQRES  21 A  728  TYR LEU CYS ASP VAL THR TRP ALA THR GLN GLU ARG ILE 
SEQRES  22 A  728  SER LEU GLN TRP LEU ARG ARG ILE GLN ASN TYR SER VAL 
SEQRES  23 A  728  MET ASP ILE CYS ASP TYR ASP GLU SER SER GLY ARG TRP 
SEQRES  24 A  728  ASN CYS LEU VAL ALA ARG GLN HIS ILE GLU MET SER THR 
SEQRES  25 A  728  THR GLY TRP VAL GLY ARG PHE ARG PRO SER GLU PRO HIS 
SEQRES  26 A  728  PHE THR LEU ASP GLY ASN SER PHE TYR LYS ILE ILE SER 
SEQRES  27 A  728  ASN GLU GLU GLY TYR ARG HIS ILE CYS TYR PHE GLN ILE 
SEQRES  28 A  728  ASP LYS LYS ASP CYS THR PHE ILE THR LYS GLY THR TRP 
SEQRES  29 A  728  GLU VAL ILE GLY ILE GLU ALA LEU THR SER ASP TYR LEU 
SEQRES  30 A  728  TYR TYR ILE SER ASN GLU TYR LYS GLY MET PRO GLY GLY 
SEQRES  31 A  728  ARG ASN LEU TYR LYS ILE GLN LEU SER ASP TYR THR LYS 
SEQRES  32 A  728  VAL THR CYS LEU SER CYS GLU LEU ASN PRO GLU ARG CYS 
SEQRES  33 A  728  GLN TYR TYR SER VAL SER PHE SER LYS GLU ALA LYS TYR 
SEQRES  34 A  728  TYR GLN LEU ARG CYS SER GLY PRO GLY LEU PRO LEU TYR 
SEQRES  35 A  728  THR LEU HIS SER SER VAL ASN ASP LYS GLY LEU ARG VAL 
SEQRES  36 A  728  LEU GLU ASP ASN SER ALA LEU ASP LYS MET LEU GLN ASN 
SEQRES  37 A  728  VAL GLN MET PRO SER LYS LYS LEU ASP PHE ILE ILE LEU 
SEQRES  38 A  728  ASN GLU THR LYS PHE TRP TYR GLN MET ILE LEU PRO PRO 
SEQRES  39 A  728  HIS PHE ASP LYS SER LYS LYS TYR PRO LEU LEU LEU ASP 
SEQRES  40 A  728  VAL TYR ALA GLY PRO CYS SER GLN LYS ALA ASP THR VAL 
SEQRES  41 A  728  PHE ARG LEU ASN TRP ALA THR TYR LEU ALA SER THR GLU 
SEQRES  42 A  728  ASN ILE ILE VAL ALA SER PHE ASP GLY ARG GLY SER GLY 
SEQRES  43 A  728  TYR GLN GLY ASP LYS ILE MET HIS ALA ILE ASN ARG ARG 
SEQRES  44 A  728  LEU GLY THR PHE GLU VAL GLU ASP GLN ILE GLU ALA ALA 
SEQRES  45 A  728  ARG GLN PHE SER LYS MET GLY PHE VAL ASP ASN LYS ARG 
SEQRES  46 A  728  ILE ALA ILE TRP GLY TRP SER TYR GLY GLY TYR VAL THR 
SEQRES  47 A  728  SER MET VAL LEU GLY SER GLY SER GLY VAL PHE LYS CYS 
SEQRES  48 A  728  GLY ILE ALA VAL ALA PRO VAL SER ARG TRP GLU TYR TYR 
SEQRES  49 A  728  ASP SER VAL TYR THR GLU ARG TYR MET GLY LEU PRO THR 
SEQRES  50 A  728  PRO GLU ASP ASN LEU ASP HIS TYR ARG ASN SER THR VAL 
SEQRES  51 A  728  MET SER ARG ALA GLU ASN PHE LYS GLN VAL GLU TYR LEU 
SEQRES  52 A  728  LEU ILE HIS GLY THR ALA ASP ASP ASN VAL HIS PHE GLN 
SEQRES  53 A  728  GLN SER ALA GLN ILE SER LYS ALA LEU VAL ASP VAL GLY 
SEQRES  54 A  728  VAL ASP PHE GLN ALA MET TRP TYR THR ASP GLU ASP HIS 
SEQRES  55 A  728  GLY ILE ALA SER SER THR ALA HIS GLN HIS ILE TYR THR 
SEQRES  56 A  728  HIS MET SER HIS PHE ILE LYS GLN CYS PHE SER LEU PRO 
SEQRES   1 B  728  SER ARG LYS THR TYR THR LEU THR ASP TYR LEU LYS ASN 
SEQRES   2 B  728  THR TYR ARG LEU LYS LEU TYR SER LEU ARG TRP ILE SER 
SEQRES   3 B  728  ASP HIS GLU TYR LEU TYR LYS GLN GLU ASN ASN ILE LEU 
SEQRES   4 B  728  VAL PHE ASN ALA GLU TYR GLY ASN SER SER VAL PHE LEU 
SEQRES   5 B  728  GLU ASN SER THR PHE ASP GLU PHE GLY HIS SER ILE ASN 
SEQRES   6 B  728  ASP TYR SER ILE SER PRO ASP GLY GLN PHE ILE LEU LEU 
SEQRES   7 B  728  GLU TYR ASN TYR VAL LYS GLN TRP ARG HIS SER TYR THR 
SEQRES   8 B  728  ALA SER TYR ASP ILE TYR ASP LEU ASN LYS ARG GLN LEU 
SEQRES   9 B  728  ILE THR GLU GLU ARG ILE PRO ASN ASN THR GLN TRP VAL 
SEQRES  10 B  728  THR TRP SER PRO VAL GLY HIS LYS LEU ALA TYR VAL TRP 
SEQRES  11 B  728  ASN ASN ASP ILE TYR VAL LYS ILE GLU PRO ASN LEU PRO 
SEQRES  12 B  728  SER TYR ARG ILE THR TRP THR GLY LYS GLU ASP ILE ILE 
SEQRES  13 B  728  TYR ASN GLY ILE THR ASP TRP VAL TYR GLU GLU GLU VAL 
SEQRES  14 B  728  PHE SER ALA TYR SER ALA LEU TRP TRP SER PRO ASN GLY 
SEQRES  15 B  728  THR PHE LEU ALA TYR ALA GLN PHE ASN ASP THR GLU VAL 
SEQRES  16 B  728  PRO LEU ILE GLU TYR SER PHE TYR SER ASP GLU SER LEU 
SEQRES  17 B  728  GLN TYR PRO LYS THR VAL ARG VAL PRO TYR PRO LYS ALA 
SEQRES  18 B  728  GLY ALA VAL ASN PRO THR VAL LYS PHE PHE VAL VAL ASN 
SEQRES  19 B  728  THR ASP SER LEU SER SER VAL THR ASN ALA THR SER ILE 
SEQRES  20 B  728  GLN ILE THR ALA PRO ALA SER MET LEU ILE GLY ASP HIS 
SEQRES  21 B  728  TYR LEU CYS ASP VAL THR TRP ALA THR GLN GLU ARG ILE 
SEQRES  22 B  728  SER LEU GLN TRP LEU ARG ARG ILE GLN ASN TYR SER VAL 
SEQRES  23 B  728  MET ASP ILE CYS ASP TYR ASP GLU SER SER GLY ARG TRP 
SEQRES  24 B  728  ASN CYS LEU VAL ALA ARG GLN HIS ILE GLU MET SER THR 
SEQRES  25 B  728  THR GLY TRP VAL GLY ARG PHE ARG PRO SER GLU PRO HIS 
SEQRES  26 B  728  PHE THR LEU ASP GLY ASN SER PHE TYR LYS ILE ILE SER 
SEQRES  27 B  728  ASN GLU GLU GLY TYR ARG HIS ILE CYS TYR PHE GLN ILE 
SEQRES  28 B  728  ASP LYS LYS ASP CYS THR PHE ILE THR LYS GLY THR TRP 
SEQRES  29 B  728  GLU VAL ILE GLY ILE GLU ALA LEU THR SER ASP TYR LEU 
SEQRES  30 B  728  TYR TYR ILE SER ASN GLU TYR LYS GLY MET PRO GLY GLY 
SEQRES  31 B  728  ARG ASN LEU TYR LYS ILE GLN LEU SER ASP TYR THR LYS 
SEQRES  32 B  728  VAL THR CYS LEU SER CYS GLU LEU ASN PRO GLU ARG CYS 
SEQRES  33 B  728  GLN TYR TYR SER VAL SER PHE SER LYS GLU ALA LYS TYR 
SEQRES  34 B  728  TYR GLN LEU ARG CYS SER GLY PRO GLY LEU PRO LEU TYR 
SEQRES  35 B  728  THR LEU HIS SER SER VAL ASN ASP LYS GLY LEU ARG VAL 
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B 
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B 
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B 
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A 
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A 
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HET 


NAG 


B 


794 


14 
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Nl 05 


FORMUL   7  NAG    C8 H15 N1 O5
FORMUL   8  NAG    C8 H15
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12 
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13 
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HG 
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FORMUL 


15 


HG 
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FORMUL 


16 


HG 


HG1 ++ 






FORMUL  17   HG    HG1 ++
FORMUL  18   HG    HG1 ++
FORMUL  19   HG    HG1 ++
FORMUL  20  HOH   *322(H2 O1)




CRYST1   65.496   68.240  419.289



SER 


ALA 


LEU 


ASP 
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VAL 
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VAL 
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ARG 


TRP 
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TYR 
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MET 
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PRO 
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ASP 
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VAL 
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THR 
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ASN 
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HIS 


PHE 


GLN 
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LYS 
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LEU 


VAL 


ASP 
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GLY 
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MET 
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TYR 


THR 


ASP 
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ASP 


HIS 


SER 


THR 
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HIS 


GLN 


HIS 


ILE 
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ILE 
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GLN 


CYS 


PHE 
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LEU 


PRO 



90.00 90.00 90.00 P 21 21 21 4 



Col. 1   2         3          4        5       6   7   8   9    10
         Atom No.  Atom type  Aa type  Aa No.  X   Y   Z   OCC  B factor



ATOM      1  N   SER A  39      81.432  37.048  22.064  1.00 53.58
ATOM      2  CA  SER A  39      81.906  38.278  21.379  1.00 53.40
ATOM      3  C   SER A  39      82.622  39.311  22.300  1.00 53.71
ATOM      4  O   SER A  39      82.300  40.493  22.268  1.00 54.99
ATOM      5  CB  SER A  39      80.683  38.903  20.729  1.00 53.66
ATOM      6  OG  SER A  39      79.738  37.881  20.418  1.00 51.39
ATOM      7  N   ARG A  40      83.591  38.872  23.109  1.00 53.25
ATOM      8  CA  ARG A  40      84.264  39.750  24.098  1.00 52.31
ATOM      9  C   ARG A  40      83.601  40.490  25.265  1.00 50.54
ATOM     10  O   ARG A  40      83.314  39.903  26.298  1.00 49.96
ATOM     11  CB  ARG A  40      85.768  39.965  23.920  1.00 52.68
ATOM     12  CG  ARG A  40      86.628  38.946  24.740  1.00 54.78
ATOM     13  CD  ARG A  40      85.794  37.897  25.546  1.00 57.24
ATOM     14  NE  ARG A  40      86.328  36.534  25.499  1.00 58.28
ATOM     15  CZ  ARG A  40      85.660  35.448
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65.891 71.404 10.679 1.00 56.01 

67.215 71.747 9.995 1.00 56.92 

67.374 72.851 9.475 1.00 58.85 

68.167 70.801 9.994 1.00 55.37 

65.580 74.568 9.641 1.00 57.67 

65.998 75.956 9.812 1.00 59.12 

64.889 77.022 9.927 1.00 59.80 

65.141 78.116 10.453 1.00 59.65 

66.904 76.360 8.639 1.00 59.16 

66.789 75.443 7.558 1.00 60.69 
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62.608 77.720 9.370 1.00 61.40 
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68.432 78.630 11.613 1.00 68.66 

64.725 81.968 14.197 1.00 67.56 

64.459 83.209 14.936 1.00 67.97 

63.127 83.189 15.714 1.00 67.64 

62.552 84.234 16.000 1.00 67.78 

64.515 84.432 13.994 1.00 67.97 

65.920 84.765 13.490 1.00 68.74 

66.349 86.191 13.828 1.00 69.30

66.456 86.505 15.033 1.00 68.09 

66.582 87.001 12.895 1.00 70.23 

62.648 82.005 16.066 1.00 67.34 

61.422 81.898 16.839 1.00 67.25 

61.657 82.331 18.272 1.00 66.90 

60.790 82.911 18.906 1.00 66.31 

60.933 80.464 16.857 1.00 67.37 

59.548 80.311 17.401 1.00 67.61 

58.468 80.818 16.715 1.00 68.05 

59.325 79.663 18.597 1.00 67.98 

57.190 80.674 17.211 1.00 68.39 
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43. 


75 


HETATM12246  O   HOH   284      60.509  67.789  68.395  1.00 22.45
HETATM12247  O   HOH   285      57.398  66.633  70.380  1.00 33.83
HETATM12248  O   HOH   286      58.553  64.183  70.306  1.00 45.07
HETATM12249  O   HOH   287      28.754  79.787  24.414  1.00 41.13
HETATM12250  O   HOH   288      27.759  71.284  45.936  1.00 47.91
HETATM12251  O   HOH   289      23.927  72.799  35.757  1.00 51.30
HETATM12252  O   HOH   290      29.955  73.971  39.463  1.00 36.46
HETATM12253  O   HOH   291      25.897  53.293  41.801  1.00 33.14
HETATM12254  O   HOH   292      23.797  50.547  38.975  1.00 31.04
HETATM12255  O   HOH   293      26.779  49.888  39.145  1.00 36.09
HETATM12256  O   HOH   294      27.839  58.254  37.402  1.00 26.51
HETATM12257  O   HOH   295      29.803  58.215  43.171  1.00 23.11
HETATM12258  O   HOH   296      29.469  60.011  41.576  1.00 34.44
HETATM12259  O   HOH   297      32.193  40.552  38.804  1.00 45.78
HETATM12260  O   HOH   298      33.709  34.220  29.537  1.00 32.56
HETATM12261  O   HOH   299      39.324  47.614  21.483  1.00 33.19
HETATM12262  O   HOH   300      33.791  44.525  25.455  1.00 34.40
HETATM12263  O   HOH   301      34.210  32.867  17.969  1.00 23.95
HETATM12264  O   HOH   302      23.518  42.390  14.824  1.00 33.39
HETATM12265  O   HOH   303      28.153  45.492   6.361  1.00 30.26
HETATM12266  O   HOH   304      26.608  48.522   7.079  1.00 29.68
HETATM12267  O   HOH   305      38.605  48.045  -0.774  1.00 48.54
HETATM12268  O   HOH   306      36.442  48.639  -1.382  1.00 51.66
HETATM12269  O   HOH   307      33.276  49.992   5.200  1.00 34.73
HETATM12270  O   HOH   308      34.560  28.406  -1.463  1.00 56.41
HETATM12271  O   HOH   309      46.509  52.025  11.464  1.00 23.72
HETATM12272  O   HOH   310      40.013  51.475   8.495  1.00 39.95
HETATM12273  O   HOH   311      63.562  52.804   2.547  1.00 38.56
HETATM12274  O   HOH   312      66.967  44.809   5.191  1.00 43.64
HETATM12275  O   HOH   313      76.726  33.117  24.145  1.00 31.10
HETATM12276  O   HOH   314      45.201  27.566  28.129  1.00 32.65
HETATM12277  O   HOH   315      62.406  37.653  31.681  1.00 33.49
HETATM12278  O   HOH   316      67.033  50.301  26.622  1.00 28.28
HETATM12279  O   HOH   317      48.216  37.093  36.293  1.00 27.89
HETATM12280  O   HOH   318      36.680  27.536  26.666  1.00 43.58
HETATM12281  O   HOH   319      42.690  28.000  29.436  1.00 29.76
HETATM12282  O   HOH   320      47.256  39.106  52.493  1.00 27.93
HETATM12283  O   HOH   321      58.126  34.638  53.518  1.00 32.97
HETATM12284  O   HOH   322      64.011  42.183  54.777  1.00 26.52
HETATM12285  O   HOH   323      57.427  64.632  46.535  1.00 24.28
HETATM12286  O   HOH   324      56.723  63.053  51.391  1.00 23.85
HETATM12287  O   HOH   325      67.474  64.172  35.795  1.00 26.71
HETATM12288  O   HOH   326      65.117  63.674  33.106  1.00 34.74
HETATM12289  O   HOH   327      77.532  52.988  43.002  1.00 35.17
HETATM12290  O   HOH   328      73.665  41.787  70.523  1.00 23.80
HETATM12291  O   HOH   329      74.243  39.155  71.502  1.00 37.50
HETATM12292  O   HOH   330      65.915  51.647  74.886  1.00 33.80
HETATM12293  O   HOH   331      63.198  51.539  76.002  1.00 44.63
HETATM12294  O   HOH   332      68.579  56.719  74.627  1.00 36.47
HETATM12295  O   HOH   333      62.332  54.612  89.660  1.00 36.15
HETATM12296  O   HOH   334      59.454  68.706 111.542  1.00 31.80
HETATM12297  O   HOH   335      53.783  65.446  77.107  1.00 35.33
HETATM12298  O   HOH   336      52.096  74.528  87.111  1.00 54.53
HETATM12299  O   HOH   337      53.792  79.518  82.367  1.00 42.24
HETATM12300  O   HOH   338      45.757  92.494  97.309  1.00 39.29
HETATM12301  O   HOH   339      39.105  56.189  55.767  1.00 27.64
HETATM12302  O   HOH   340      43.199  92.659  61.430  1.00 37.84
HETATM12303  O   HOH   341      53.836  85.197  55.803  1.00 33.80
HETATM12304  O   HOH   342      53.706  94.980  73.302  1.00 28.63
HETATM12305  O   HOH   343      51.760  94.004  75.045  1.00 33.39
HETATM12306  O   HOH   344      58.030  89.168  61.516  1.00 25.90
HETATM12307  O   HOH   345      50.970 107.755  84.519  1.00 48.79
HETATM12308  O   HOH   346      64.514  83.981  93.646  1.00 31.88
HETATM12309  O   HOH   347      80.236  91.940  81.786  1.00 40.68
HETATM12310  O   HOH   348      75.328  85.599 104.775  1.00 24.56
HETATM12311  O   HOH   349      79.517  79.180 102.402  1.00 25.88
HETATM12312  O   HOH   350      80.747  63.743  95.369  1.00 27.89
HETATM12313  O   HOH   351      64.969  77.943 106.765  1.00 44.27
HETATM12314  O   HOH   352      95.965  80.561  67.682  1.00 44.80
HETATM12315  O   HOH   353      86.914  49.199  79.546  1.00 28.91
HETATM12316  O   HOH   354      77.363  57.263  75.679  1.00 24.96
HETATM12317  O   HOH   355      78.207  53.138  78.606  1.00 31.72
HETATM12318  O   HOH   356      87.975  68.747  54.296  1.00 50.80
HETATM12319  O   HOH   357      85.047  62.868  55.786  1.00 31.85
HETATM12320  O   HOH   358      86.034  61.805  52.552  1.00 31.60
HETATM12321  O   HOH   359      79.445  74.128  45.275  1.00 33.57
HETATM12322  O   HOH   360      56.053  54.524 102.588  1.00171.18
HETATM12323  O   HOH   361      48.029  63.170 110.923  1.00 47.18
HETATM12324  O   HOH   362      51.605  65.693 106.423  1.00 47.01
HETATM12325  O   HOH   363      50.673  68.039 105.495  1.00 46.13
HETATM12326  O   HOH   364      94.322  44.608  67.120  1.00 45.40
HETATM12327  O   HOH   365      86.923  43.646  74.686  1.00 33.81
HETATM12328  O   HOH   366      79.642  38.900  69.678  1.00 50.27
HETATM12329  O   HOH   367      67.633  24.490  28.602  1.00 52.13
HETATM12330  O   HOH   368      54.251  58.966  34.469  1.00 43.40
HETATM12331  O   HOH   369      51.371  57.464  36.619  1.00 24.00
HETATM12332  O   HOH   370      59.016  48.135  34.799  1.00 40.23
HETATM12333  O   HOH   371      34.879  31.553   9.868  1.00 44.94
HETATM12334  O   HOH   372      27.580  41.566  39.476  1.00 42.65
HETATM12335  O   HOH   373      24.846  42.734  35.135  1.00 51.21
HETATM12336  O   HOH   374      19.556  46.158  34.315  1.00 53.49
HETATM12337  O   HOH   375      83.691  70.175  77.027  1.00 28.28
HETATM12338  O   HOH   376      74.717  68.866  78.833  1.00 37.76
HETATM12339  O   HOH   377      76.631  68.088  80.362  1.00 29.47
HETATM12340  O   HOH   378      58.860  55.128   1.822  1.00 30.66
HETATM12341  O   HOH   379      62.809  55.151  -3.277  1.00 40.41
HETATM12342  O   HOH   380      33.273  62.466  49.352  1.00 41.97
HETATM12343  O   HOH   381      28.588  59.352  49.926  1.00 41.32
HETATM12344  O   HOH   382      30.906  56.703  47.696  1.00 41.49
HETATM12345  O   HOH   383      35.506  55.437  50.284  1.00 36.64
HETATM12346  O   HOH   384      87.842  80.426  66.401  1.00 43.96
HETATM12347  O   HOH   385      86.490  79.913  76.221  1.00 32.49
HETATM12348  O   HOH   386      84.867  74.141  57.146  1.00 40.97
HETATM12349  O   HOH   387      82.643  79.545  52.006  1.00 50.83
HETATM12350  O   HOH   388      68.042  83.874  47.140  1.00 49.08
HETATM12351  O   HOH   389      52.056  92.832  68.966  1.00 43.35
HETATM12352  O   HOH   390      54.797  93.224  71.916  1.00 40.36
HETATM12353  O   HOH   391      57.293  91.228  67.268  1.00 24.55
HETATM12354  O   HOH   392      56.898  89.074  65.525  1.00 32.59
HETATM12355  O   HOH   393      55.335  90.968  68.860  1.00 25.76
HETATM12356  O   HOH   394      56.153  89.417  62.819  1.00 30.99
HETATM12357  O   HOH   395      59.579 102.134  76.705  1.00 40.70
HETATM12358  O   HOH   396      61.841 100.995  93.173  1.00 50.46
HETATM12359  O   HOH   397      71.154  98.292  81.895  1.00 31.13
HETATM12360  O   HOH   398      75.477  93.747  78.061  1.00 36.92
HETATM12361  O   HOH   399      79.703  89.990  74.196  1.00 49.92
HETATM12362  O   HOH   400      85.642  70.504  75.265  1.00 34.21
HETATM12363 HG    HG Y 303      42.410  43.821  32.702  1.00 59.73
HETATM12364 HG    HG Y 301      35.399  52.819  33.178  1.00 65.74
HETATM12365 HG    HG Y 302      36.321  52.198  31.093  1.00103.48
HETATM12366 HG    HG Z 303      73.145  77.979  72.298  1.00 63.81
HETATM12367 HG    HG Z 301      63.582  84.279  71.535  1.00 65.41
HETATM12368 HG    HG Z 302      64.171  83.832  74.081  1.00106.14
END

Column 2 lists a number for the atom in the structure.

Column 3 lists the element whose coordinates are measured. The first letter in the column defines the element.

Column 4 lists the type of amino acid.

Column 5 lists a number for the amino acid in the structure.

Columns 6-8 list the crystallographic coordinates X, Y, and Z respectively. The crystallographic coordinates define the atomic position of the element measured.

Column 9 lists an occupancy factor that refers to the fraction of the molecules in which each atom occupies the position specified by the coordinates. A value of "1" indicates that each atom has the same conformation, i.e., the same position, in all molecules of the crystal.

Column 10 lists a thermal factor "B" that measures movement of the atom around its atomic center.
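
The column description above can be illustrated with a short parsing sketch. It is a minimal example, assuming the coordinate rows follow the usual fixed-width PDB layout; the character offsets and the names `AtomRecord`, `parse_atom_line`, and `rms_displacement` are illustrative assumptions and do not appear in the original text. Slicing by character position rather than splitting on whitespace matters because adjacent fields can run together when a value fills its field (for example an occupancy of 1.00 followed by a B factor of 106.14). The displacement helper assumes the standard isotropic relation B = 8π²⟨u²⟩.

```python
import math
from typing import NamedTuple


class AtomRecord(NamedTuple):
    record: str       # column 1: record type (ATOM or HETATM)
    serial: int       # column 2: atom number
    atom_name: str    # column 3: atom type
    res_name: str     # column 4: amino acid (residue) type
    chain_id: str     # chain letter printed between columns 4 and 5
    res_seq: int      # column 5: amino acid (residue) number
    x: float          # columns 6-8: crystallographic coordinates
    y: float
    z: float
    occupancy: float  # column 9: occupancy factor
    b_factor: float   # column 10: thermal factor B


def parse_atom_line(line: str) -> AtomRecord:
    """Slice one ATOM/HETATM row by character position (assumed PDB offsets)."""
    line = line.ljust(66)  # pad so slicing never runs past a short line
    return AtomRecord(
        record=line[0:6].strip(),
        serial=int(line[6:11]),
        atom_name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21].strip(),
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )


def rms_displacement(b_factor: float) -> float:
    """Root-mean-square displacement implied by B, assuming B = 8*pi^2*<u^2>."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


if __name__ == "__main__":
    rows = [
        "ATOM      1  N   SER A  39      81.432  37.048  22.064  1.00 53.58",
        # Occupancy and B factor touch here; fixed-width slicing still works.
        "HETATM12368 HG    HG Z 302      64.171  83.832  74.081  1.00106.14",
    ]
    for row in rows:
        rec = parse_atom_line(row)
        print(rec.record, rec.serial, rec.atom_name, rec.res_name,
              rec.chain_id, rec.res_seq, (rec.x, rec.y, rec.z),
              rec.occupancy, rec.b_factor,
              round(rms_displacement(rec.b_factor), 3))
```

Rows that are truncated in the listing (a row that stops after the Y coordinate, for instance) would raise a `ValueError` in this sketch and would need to be skipped or handled separately.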
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Sequence Listing 

<110> F. Hoffmann-La Roche AG

<120> Crystal structure of DPP-IV and its use 

<130> Case 21491 

<160> 2 

<170> PatentIn version 3.1

<210> 1 

<211> 2211 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> cDNA for aa 31 to 766 of DPP-IV



<400> 1 



ggcacagatg atgctacagc tgacagtcgc aaaacttaca ctctaactga ttacttaaaa        60
aatacttata gactgaagtt atactcctta agatggattt cagatcatga atatctctac       120
aaacaagaaa ataatatctt ggtattcaat gctgaatatg gaaacagctc agttttcttg       180
gagaacagta catttgatga gtttggacat tctatcaatg attattcaat atctcctgat       240
gggcagttta ttctcttaga atacaactac gtgaagcaat ggaggcattc ctacacagct       300
tcatatgaca tttatgattt aaataaaagg cagctgatta cagaagagag gattccaaac       360
aacacacagt gggtcacatg gtcaccagtg ggtcataaat tggcatatgt ttggaacaat       420
gacatttatg ttaaaattga accaaattta ccaagttaca gaatcacatg gacggggaaa       480
gaagatataa tatataatgg aataactgac tgggtttatg aagaggaagt cttcagtgcc       540
tactctgctc tgtggtggtc tccaaacggc acttttttag catatgccca atttaacgac       600
acagaagtcc cacttattga atactccttc tactctgatg agtcactgca gtacccaaag       660
actgtacggg ttccatatcc aaaggcagga gctgtgaatc caactgtaaa gttctttgtt       720
gtaaatacag actctctcag ctcagtcacc aatgcaactt ccatacaaat cactgctcct       780
gcttctatgt tgatagggga tcactacttg tgtgatgtga catgggcaac acaagaaaga       840
atttctttgc agtggctcag gaggattcag aactattcgg tcatggatat ttgtgactat       900
gatgaatcca gtggaagatg gaactgctta gtggcacggc aacacattga aatgagtact       960
actggctggg ttggaagatt taggccttca gaacctcatt ttacccttga tggtaatagc      1020
ttctacaaga tcatcagcaa tgaagaaggt tacagacaca tttgctattt ccaaatagat      1080
aaaaaagact gcacatttat tacaaaaggc acctgggaag tcatcgggat agaagctcta      1140
accagtgatt atctatacta cattagtaat gaatataaag gaatgccagg aggaaggaat      1200
ctttataaaa tccaacttat tgactataca aaagtgacat gcctcagttg tgagctgaat      1260
ccggaaaggt gtcagtacta ttctgtgtca ttcagtaaag aggcgaagta ttatcagctg      1320
agatgttccg gtcctggtct gcccctctat actctacaca gcagcgtgaa tgataaaggg      1380
ctgagagtcc tggaagacaa ttcagctttg gataaaatgc tgcagaatgt ccagatgccc      1440
tccaaaaaac tggacttcat tattttgaat gaaacaaaat tttggtatca gatgatcttg      1500
cctcctcatt ttgataaatc caagaaatat cctctactat tagatgtgta tgcaggccca      1560
tgtagtcaaa aagcagacac tgtcttcaga ctgaactggg ccacttacct tgcaagcaca      1620
gaaaacatta tagtagctag ctttgatggc agaggaagtg gttaccaagg agataagatc      1680
atgcatgcaa tcaacagaag actgggaaca tttgaagttg aagatcaaat tgaagcagcc      1740
agacaatttt caaaaatggg atttgtggac aacaaacgaa ttgcaatttg gggctggtca      1800
tatggagggt acgtaacctc aatggtcctg ggatcgggaa gtggcgtgtt caagtgtgga      1860
atagccgtgg cgcctgtatc ccggtgggag tactatgact cagtgtacac agaacgttac      1920
atgggtctcc caactccaga agacaacctt gaccattaca gaaattcaac agtcatgagc      1980
agagctgaaa attttaaaca agttgagtac ctccttattc atggaacagc agatgataac      2040
gttcactttc agcagtcagc tcagatctcc aaagccctgg tcgatgttgg agtggatttc      2100
caggcaatgt ggtatactga tgaagaccat ggaatagcta gcagcacagc acaccaacat      2160
atatataccc acatgagcca cttcataaaa caatgtttct ctttacctta g               2211
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<223> 



<400> 2 

Gly Thr Asp Asp Ala Thr Ala Asp Ser Arg Lys Thr Tyr Thr Leu Thr
1               5                   10                  15

Asp Tyr Leu Lys Asn Thr Tyr Arg Leu Lys Leu Tyr Ser Leu Arg Trp
            20                  25                  30

Ile Ser Asp His Glu Tyr Leu Tyr Lys Gln Glu Asn Asn Ile Leu Val
        35                  40                  45

Phe Asn Ala Glu Tyr Gly Asn Ser Ser Val Phe Leu Glu Asn Ser Thr
    50                  55                  60

Phe Asp Glu Phe Gly His Ser Ile Asn Asp Tyr Ser Ile Ser Pro Asp
65                  70                  75                  80

Gly Gln Phe Ile Leu Leu Glu Tyr Asn Tyr Val Lys Gln Trp Arg His
                85                  90                  95

Ser Tyr Thr Ala Ser Tyr Asp Ile Tyr Asp Leu Asn Lys Arg Gln Leu
            100                 105                 110

Ile Thr Glu Glu Arg Ile Pro Asn Asn Thr Gln Trp Val Thr Trp Ser
        115                 120                 125

Pro Val Gly His Lys Leu Ala Tyr Val Trp Asn Asn Asp Ile Tyr Val
    130                 135                 140

Lys Ile Glu Pro Asn Leu Pro Ser Tyr Arg Ile Thr Trp Thr Gly Lys
145                 150                 155                 160

Glu Asp Ile Ile Tyr Asn Gly Ile Thr Asp Trp Val Tyr Glu Glu Glu
                165                 170                 175

Val Phe Ser Ala Tyr Ser Ala Leu Trp Trp Ser Pro Asn Gly Thr Phe
            180                 185                 190

Leu Ala Tyr Ala Gln Phe Asn Asp Thr Glu Val Pro Leu Ile Glu Tyr
        195                 200                 205

Ser Phe Tyr Ser Asp Glu Ser Leu Gln Tyr Pro Lys Thr Val Arg Val
    210                 215                 220

Pro Tyr Pro Lys Ala Gly Ala Val Asn Pro Thr Val Lys Phe Phe Val
225                 230                 235                 240

Val Asn Thr Asp Ser Leu Ser Ser Val Thr Asn Ala Thr Ser Ile Gln
                245                 250                 255

Ile Thr Ala Pro Ala Ser Met Leu Ile Gly Asp His Tyr Leu Cys Asp
            260                 265                 270

Val Thr Trp Ala Thr Gln Glu Arg Ile Ser Leu Gln Trp Leu Arg Arg
        275                 280                 285

Ile Gln Asn Tyr Ser Val Met Asp Ile Cys Asp Tyr Asp Glu Ser Ser
    290                 295                 300

Gly Arg Trp Asn Cys Leu Val Ala Arg Gln His Ile Glu Met Ser Thr
305                 310                 315                 320

Thr Gly Trp Val Gly Arg Phe Arg Pro Ser Glu Pro His Phe Thr Leu
                325                 330                 335

Asp Gly Asn Ser Phe Tyr Lys Ile Ile Ser Asn Glu Glu Gly Tyr Arg
            340                 345                 350

His Ile Cys Tyr Phe Gln Ile Asp Lys Lys Asp Cys Thr Phe Ile Thr
        355                 360                 365

Lys Gly Thr Trp Glu Val Ile Gly Ile Glu Ala Leu Thr Ser Asp Tyr
    370                 375                 380

Leu Tyr Tyr Ile Ser Asn Glu Tyr Lys Gly Met Pro Gly Gly Arg Asn
385                 390                 395                 400

Leu Tyr Lys Ile Gln Leu Ile Asp Tyr Thr Lys Val Thr Cys Leu Ser
                405                 410                 415

Cys Glu Leu Asn Pro Glu Arg Cys Gln Tyr Tyr Ser Val Ser Phe Ser
            420                 425                 430

Lys Glu Ala Lys Tyr Tyr Gln Leu Arg Cys Ser Gly Pro Gly Leu Pro
        435                 440                 445

Leu Tyr Thr Leu His Ser Ser Val Asn Asp Lys Gly Leu Arg Val Leu
    450                 455                 460

Glu Asp Asn Ser Ala Leu Asp Lys Met Leu Gln Asn Val Gln Met Pro
465                 470                 475                 480

Ser Lys Lys Leu Asp Phe Ile Ile Leu Asn Glu Thr Lys Phe Trp Tyr
                485                 490                 495

Gln Met Ile Leu Pro Pro His Phe Asp Lys Ser Lys Lys Tyr Pro Leu
            500                 505                 510

Leu Leu Asp Val Tyr Ala Gly Pro Cys Ser Gln Lys Ala Asp Thr Val
        515                 520                 525

Phe Arg Leu Asn Trp Ala Thr Tyr Leu Ala Ser Thr Glu Asn Ile Ile
    530                 535                 540

Val Ala Ser Phe Asp Gly Arg Gly Ser Gly Tyr Gln Gly Asp Lys Ile
545                 550                 555                 560

Met His Ala Ile Asn Arg Arg Leu Gly Thr Phe Glu Val Glu Asp Gln
                565                 570                 575

Ile Glu Ala Ala Arg Gln Phe Ser Lys Met Gly Phe Val Asp Asn Lys
            580                 585                 590

Arg Ile Ala Ile Trp Gly Trp Ser Tyr Gly Gly Tyr Val Thr Ser Met
        595                 600                 605

Val Leu Gly Ser Gly Ser Gly Val Phe Lys Cys Gly Ile Ala Val Ala
    610                 615                 620

Pro Val Ser Arg Trp Glu Tyr Tyr Asp Ser Val Tyr Thr Glu Arg Tyr
625                 630                 635                 640

Met Gly Leu Pro Thr Pro Glu Asp Asn Leu Asp His Tyr Arg Asn Ser
                645                 650                 655

Thr Val Met Ser Arg Ala Glu Asn Phe Lys Gln Val Glu Tyr Leu Leu
            660                 665                 670

Ile His Gly Thr Ala Asp Asp Asn Val His Phe Gln Gln Ser Ala Gln
        675                 680                 685

Ile Ser Lys Ala Leu Val Asp Val Gly Val Asp Phe Gln Ala Met Trp
    690                 695                 700

Tyr Thr Asp Glu Asp His Gly Ile Ala Ser Ser Thr Ala His Gln His
705                 710                 715                 720

Ile Tyr Thr His Met Ser His Phe Ile Lys Gln Cys Phe Ser Leu Pro
                725                 730                 735
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